TencentAILabHealthcare / scTranslator


Error encountered when trying to run your demo code #4

Closed CCCC1800 closed 4 months ago

CCCC1800 commented 8 months ago

Hi, I tried to run the demo code provided in this GitHub repo, but encountered an error at step 6:

# Inference without fine-tune
$ python code/stage3_inference_without_finetune.py \
--pretrain_checkpoint='checkpoint/stage2_single-cell_scTranslator.pt' \
--RNA_path='dataset/test/dataset1/GSM5008737_RNA_finetune_withcelltype.h5ad' \
--Pro_path='dataset/test/dataset1/GSM5008738_protein_finetune_withcelltype.h5ad'

When I run this command, it fails with the error "NameError: name 'fix_SCDataset' is not defined". I checked your model code, and it seems that neither "fix_SCDataset" nor "SCDataset" is defined. Could you fix this bug and post updated code that runs without error?

Thank you.

ElaineLIU-920 commented 8 months ago

Hi, thank you for using and helping us to test scTranslator.

I have carefully re-tested the provided code, and it can be successfully executed from start to finish. Furthermore, please note that the definitions of the classes 'fix_SCDataset' and 'SCDataset' are provided in the 'code/model/utils.py' file. To import the 'SCDataset' class from 'code/model/utils.py', you may utilize the following method:

First, please make sure that the 'code' directory is in your Python path. You can add it to the Python path by modifying the 'sys.path' variable. Then, you can use the 'from ... import ...' statement to import the 'SCDataset' class at the beginning of 'stage3_inference_without_finetune.py' (we actually do this with 'from utils import *').

Here's an example:

import sys
import os
# Add the 'code' directory to the Python path
code_dir = os.path.abspath("code")
sys.path.append(code_dir)
# Import the SCDataset class from code/model/utils.py
from model.utils import SCDataset
# Now you can use the SCDataset class

We recommend that you strictly follow steps 1-5 before proceeding to step 6. I hope that my answer is helpful to you. If you have any further questions or need assistance, please don't hesitate to reach out.

Thanks again for using scTranslator!
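
The import example in the reply above resolves "code" relative to the current working directory, so it only works when the script is launched from the repository root. Below is a minimal sketch of a variant that resolves the directory to an absolute path before adding it. The helper name add_to_sys_path is hypothetical and is not part of scTranslator.

```python
import sys
from pathlib import Path


def add_to_sys_path(directory):
    """Resolve `directory` to an absolute path and put it first on sys.path.

    Hypothetical helper for illustration; not part of scTranslator.
    """
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        # Insert at the front so this directory wins over same-named modules.
        sys.path.insert(0, resolved)
    return resolved


# Possible usage inside a script stored in the 'code' directory
# (assumes the repository layout described in the reply above):
#   add_to_sys_path(Path(__file__).resolve().parent / "model")
#   from utils import SCDataset
```

Anchoring the path on the script's own location rather than on the working directory would make the import behave the same no matter where the command is run from.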

CCCC1800 commented 8 months ago

Thank you for your reply. It turned out that the "sys.path.append('code/model')" line somehow doesn't work on my system. I revised that line slightly, along with the code file directories, and it works on my system now.

Also, I was wondering whether scTranslator can be run without steps 1 and 2, where the "docker" command is used. I tried on both my personal computer (a typical Mac Pro) and a larger public server (with 16 CPUs), but step 1 always stops partway through with the error "No space left on device".

I'm pasting the error below. It comes from the 16-CPU public server, which I routinely use to run other models without errors:

(scTranslator) ➜ ~ docker pull linjingliu/sctranslator:latest
latest: Pulling from linjingliu/sctranslator
ee26583b0fd0: Pull complete
45fbc4938e40: Pull complete
68490db51239: Pull complete
00acb7981775: Extracting [==================================================>] 71.64MB/71.64MB
7b8bc545cfd1: Download complete
0aea654bb889: Download complete
3f587a40f146: Download complete
f06d9e26a1eb: Download complete
22b71264d93e: Download complete
9034e129c5bc: Download complete
323477e758fa: Download complete
4bbc6e0c8754: Download complete
2863c86c2c19: Download complete
e2af464cc6f6: Download complete
375a20054367: Downloading [==================================================>] 5.516GB/5.516GB
write /var/snap/docker/common/var-lib-docker/tmp/GetImageBlob2797893160: no space left on device

I think it may be impractical to follow steps 1 and 2 for most users who don't have access to a system with extremely large CPU/GPU memory. Currently I just skip steps 1-2 and start from step 3, and the code still seems to work (I installed the required Python packages in the Conda environment, although I'm not sure which exact package versions to use). Could you provide an alternative for running scTranslator without the docker command, which requires huge CPU/GPU memory? For example, could you specify the required dependency packages with exact versions so that users can install them in a Conda environment themselves, without using docker?

Thank you.
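
The "no space left on device" message in the log above is the operating system reporting a full filesystem at the Docker storage path, which concerns disk space rather than CPU or GPU memory. Below is a small sketch for checking free disk space before pulling a large image. The 20 GiB threshold is an arbitrary illustrative value, not a documented requirement of the scTranslator image.

```python
import shutil


def free_gib(path="."):
    """Return the free space, in GiB, of the filesystem that contains `path`."""
    return shutil.disk_usage(path).free / 1024**3


def has_room_for(path, required_gib):
    """True when the filesystem containing `path` has at least `required_gib` GiB free."""
    return free_gib(path) >= required_gib


if __name__ == "__main__":
    # Assumption: 20 GiB is an illustrative threshold only.
    # Point the path at the filesystem that holds Docker's data directory
    # (the log above shows it under /var/snap/docker/...).
    print(f"free: {free_gib('.'):.1f} GiB, enough: {has_room_for('.', 20)}")
```

Note that the largest layer in the log is 5.516GB compressed, and the extracted image needs additional room on the same filesystem.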

ElaineLIU-920 commented 8 months ago

Hi! It is gratifying to learn that you were able to rectify the issue pertaining to the "sys.path.append('code/model')" line.

Thank you for your feedback. While we've provided a Docker image that includes the complete environment for user convenience, it's entirely possible to run scTranslator without Docker. To assist with this, we've updated our README to include a list of required dependencies and their corresponding versions, along with a guideline for preparing the environment. We have also updated the code to run on CPU-only machines, though the fine-tuning phase still requires a GPU. We have tested the updated scTranslator on CPU machines running CentOS, and on GPU machines running Ubuntu and CentOS. All of our provided demos can be run with only 16 GB of video memory.

I hope this information is helpful. If you have any other questions or need further clarification, please don't hesitate to let us know.

Thank you!
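
The reply above refers to a README list of required dependencies with pinned versions. Below is a minimal sketch of checking an existing Conda environment against such a list. The helper check_versions and any package names or versions passed to it are placeholders; the actual pins should be taken from the scTranslator README.

```python
from importlib import metadata


def check_versions(required, get_version=metadata.version):
    """Compare installed package versions with the wanted ones.

    `required` maps distribution name -> wanted version string.
    Returns {name: (wanted, found)} for every mismatch; `found` is None
    when the package is not installed at all.
    """
    problems = {}
    for name, wanted in required.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems[name] = (wanted, None)
            continue
        if found != wanted:
            problems[name] = (wanted, found)
    return problems


# Placeholder usage; replace the mapping with the pins from the README:
#   print(check_versions({"some-package": "1.2.3"}))
```

An empty result means every listed package is installed at exactly the pinned version.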

CCCC1800 commented 8 months ago

Thank you for adding the required dependency file. I can now run your code from start to finish, but I have another question about scTranslator's output files. How can I obtain the predicted protein expression values? It seems that your code only outputs a file containing a loss value, and I can't find any output with the predicted protein expression values. Could you add code for obtaining the predicted protein expression values (showing the corresponding protein names and cell IDs)?

Also, besides the predicted protein expression values, could you add code for obtaining the ground-truth protein expression values as well (after normalization, if needed)? I want to compare scTranslator's predicted value with the corresponding ground-truth value for each protein and cell.

Thank you.

ElaineLIU-920 commented 8 months ago

Hi! Thank you for presenting this user requirement. We have updated the code to address your concerns regarding the output files. The predicted protein expression values, the true protein expression values (post-normalization), along with the corresponding protein names and cell IDs, are now available in a '.csv' format within the specified folder.

We appreciate your valuable feedback and hope that these updates meet your expectations. Should you have any further inquiries or concerns, please do not hesitate to contact us. Enjoy using scTranslator!
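
The per-protein comparison requested earlier in this thread could be sketched as follows once the '.csv' outputs exist. The file layout is an assumption (one row per cell, the cell ID in the first column, one column per protein), and the actual scTranslator output may be organized differently. The function names are hypothetical, and only the Python standard library is used.

```python
import csv
import io
import math


def read_matrix(handle):
    """Read a CSV with protein names in the header row and a cell ID in column 0.

    Assumed layout, for illustration only.
    """
    reader = csv.reader(handle)
    proteins = next(reader)[1:]
    rows = {row[0]: [float(v) for v in row[1:]] for row in reader if row}
    return proteins, rows


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)


def per_protein_correlation(pred_handle, true_handle):
    """Correlate predicted and true values per protein over the shared cells."""
    pred_proteins, pred = read_matrix(pred_handle)
    true_proteins, true = read_matrix(true_handle)
    cells = sorted(set(pred) & set(true))
    result = {}
    for name in pred_proteins:
        if name not in true_proteins:
            continue
        i, j = pred_proteins.index(name), true_proteins.index(name)
        result[name] = pearson([pred[c][i] for c in cells],
                               [true[c][j] for c in cells])
    return result


# Tiny in-memory example with made-up numbers and made-up protein names:
predicted = io.StringIO("cell,P1,P2\nc1,1,3\nc2,2,2\nc3,3,1\n")
truth = io.StringIO("cell,P1,P2\nc1,2,1\nc2,4,2\nc3,6,3\n")
correlations = per_protein_correlation(predicted, truth)
```

With real output files, the StringIO objects would be replaced by handles from open(...) on the predicted and ground-truth CSV files.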

CCCC1800 commented 8 months ago

Thank you for your update!