It seems you selected the setting "new" instead of "test", so you need to provide ground-truth labels. The "new" setting is used to benchmark performance. Please set the parameter --setting to "test". The code in main.py describes the meaning of each parameter.
Thanks. The run started, but now I am getting a RuntimeError.
Any suggestions for that?
RuntimeError: CUDA out of memory. Tried to allocate 2.16 GiB (GPU 0; 7.79 GiB total capacity; 4.37 GiB already allocated; 1.13 GiB free; 4.38 GiB reserved in total by PyTorch)
The GPU is out of memory. You can decrease the batch size, select fewer genes, or train on the CPU (which may be too slow).
Thanks for your prompt response. I changed the batch size to 32, 16 and 8, but nothing worked.
python main.py --task celltype_GRN --data_file counts_normazile_log_transformed.csv --setting test --batch_size 16 --save_name out1
I have an 8 GB Quadro RTX 4000 GPU and 94 GB of RAM.
Could you please tell me how to use the CPU to train the model? What should be the command or changes in the script to set CPU for model training?
You can delete every ".cuda()" and ".cuda" in src/DeepSEM_cell_type_test_specific_GRN_model.py and src/Model.py; the model will then train on the CPU. For example:
File src/DeepSEM_cell_type_test_specific_GRN_model.py
  line 16, 67: Tensor = torch.cuda.FloatTensor -> Tensor = torch.FloatTensor
  line 66: vae = VAE_EAD(adj_A_init, 1, self.opt.n_hidden, self.opt.K).float().cuda() -> vae = VAE_EAD(adj_A_init, 1, self.opt.n_hidden, self.opt.K).float()
  line 84: loss, loss_rec, loss_gauss, loss_cat, dec, y, hidden = vae(inputs, dropout_mask=dropout_mask.cuda(), -> loss, loss_rec, loss_gauss, loss_cat, dec, y, hidden = vae(inputs, dropout_mask=dropout_mask
File src/Model.py
  line 8: Tensor = torch.cuda.FloatTensor -> torch.FloatTensor
  line 200: torch.log(torch.FloatTensor([2.0 * np.pi])).sum(0) + torch.log(var) + torch.pow(x - mu, 2) / var, dim=-1)
Sorry for the inconvenience. Note that training the model on the CPU might be quite slow.
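As an alternative to deleting each ".cuda()" call by hand, the usual PyTorch pattern is to pick a device once and move the model and tensors to it. The snippet below is only a sketch of that pattern, not the DeepSEM code: the nn.Linear model is a hypothetical stand-in for VAE_EAD, and the tensor shapes are made up for illustration.

```python
import torch
import torch.nn as nn

# Use the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Hypothetical stand-in for the real model (VAE_EAD in the repository).
model = nn.Linear(4, 2).float().to(device)

# Create or move every tensor on the same device as the model.
inputs = torch.randn(8, 4, device=device)
dropout_mask = (torch.rand(8, 4) > 0.1).float().to(device)

outputs = model(inputs * dropout_mask)
print(outputs.shape, outputs.device)
```

With this pattern the same script runs on either device, so switching to the CPU does not require editing individual lines.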
Thanks for your response. I'll post again once the job is done so that others can also learn from my issue.
Hi! As you warned, training on the CPU takes a lot of time: my 8-core machine took more than 24 hours to complete 2 epochs, and the default settings use 120 epochs. I terminated the run.
Would it be a good idea to use only DEGs as input? What should the criterion be for subsetting genes? My dataset has around 20k cells. Please advise.
As recommended by BEELINE (https://doi.org/10.1038/s41592-019-0690-6) and as stated in the discussion section of our paper, I strongly recommend selecting DEGs (for example, 500 or 1000) and TFs as input. Note that the memory cost of W in the GRN layer scales as n^2 and the time cost of the inverse operation scales as n^3, so I recommend keeping the total number of selected genes below 2000 (with that many genes you may be able to run on the GPU).
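To make the n^2 and n^3 scaling concrete, here is a rough back-of-the-envelope calculation. It assumes 4-byte float32 values and counts only a single copy of an n x n matrix; gradients, optimizer state and activations would add to this, so treat the numbers as a lower bound. The gene count of 20000 is a hypothetical example, not a figure from this thread.

```python
def square_matrix_bytes(n_genes, bytes_per_value=4):
    """Bytes needed for one dense n x n matrix (float32 assumed by default)."""
    return n_genes * n_genes * bytes_per_value


def relative_inverse_cost(n_genes, baseline=2000):
    """Cost of an n^3 operation relative to the same operation at `baseline` genes."""
    return (n_genes / baseline) ** 3


for n in (1000, 2000, 20000):
    mib = square_matrix_bytes(n) / 2**20
    ratio = relative_inverse_cost(n)
    print(f"{n:>6} genes: {mib:8.1f} MiB per matrix, inverse cost x{ratio:g} vs 2000 genes")
```

Going from 2000 to 20000 genes multiplies the matrix size by 100 and the inverse cost by 1000, which is why trimming the gene list helps far more than reducing the batch size.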
Also note that the benchmark datasets each contain fewer than 1.5k cells, so in your experiment you can decrease either the number of epochs or the number of training cells used in each epoch. For example, you can keep n_epoch = 120 or 150 but randomly sample only about 1500 cells in each epoch.
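A per-epoch subsample like the one described above could be sketched as follows. This is not part of DeepSEM: the function name and the seed are illustrative assumptions, and the training step itself is left as a comment.

```python
import random


def sample_cell_indices(n_cells, n_per_epoch, rng):
    """Draw a random subset of cell (row) indices without replacement."""
    k = min(n_per_epoch, n_cells)
    return sorted(rng.sample(range(n_cells), k))


rng = random.Random(0)   # fixed seed so the draws are reproducible
n_cells = 20000          # roughly the dataset size mentioned in this thread

for epoch in range(3):   # 3 epochs only for illustration
    idx = sample_cell_indices(n_cells, 1500, rng)
    # Train on the expression rows selected by `idx` here.
    print(epoch, len(idx), idx[:5])
```

Because a fresh subset is drawn every epoch, most cells are still seen at some point during training even though each epoch only touches about 1500 of them.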
Another suggestion is to run DeepSEM multiple times and ensemble the results. This gives a more stable result.
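The thread does not say how the runs should be combined. One simple option, shown here purely as an assumption, is to average the absolute edge weights across runs and rank edges by that average. The edge names and weights below are made up, and an edge missing from a run is treated as having weight 0.

```python
from collections import defaultdict


def ensemble_edges(runs):
    """Average absolute edge weights over runs; missing edges count as 0."""
    totals = defaultdict(float)
    for run in runs:
        for edge, weight in run.items():
            totals[edge] += abs(weight)
    n_runs = len(runs)
    return {edge: total / n_runs for edge, total in totals.items()}


# Made-up (TF, target) -> weight tables standing in for two DeepSEM runs.
runs = [
    {("TF1", "GeneA"): 1.0, ("TF2", "GeneA"): -0.25},
    {("TF1", "GeneA"): 0.5, ("TF2", "GeneA"): 0.25},
]

avg = ensemble_edges(runs)
ranked = sorted(avg, key=avg.get, reverse=True)
print(ranked)
```

Averaging absolute values keeps edges whose sign flips between runs from cancelling out; if the sign matters for your analysis, average the signed weights instead.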
Any further questions are welcome.
Thanks for your quick response. I will implement your suggestion and will let you know.
thanks again Ekta
I am running the GRN inference step but getting a FileNotFoundError: [Errno 2]. It is not clear which file is not found. Please help.
I have installed all the required packages using pip
Thanks in advance Ekta
python main.py --task celltype_GRN --data_file counts_normazile_log_transformed.csv --setting new --alpha 100 --beta 1 --n_epoch 90 --save_name out1
Traceback (most recent call last):
File "main.py", line 77, in
model.train_model()
File "/home/ekta/Documents/1_ekta_research_work/covid_Pancreas/Qc_doublet_removal/res_4_5/analysis/anticipated_revision/DeepSEM-master/src/DeepSEM_cell_type_specific_GRN_model.py", line 80, in train_model
dataloader, Evaluate_Mask, num_nodes, num_genes, data, truth_edges, TFmask2, gene_name = self.init_data()
File "/home/ekta/Documents/1_ekta_research_work/covid_Pancreas/Qc_doublet_removal/res_4_5/analysis/anticipated_revision/DeepSEM-master/src/DeepSEM_cell_type_specific_GRN_model.py", line 32, in init_data
Ground_Truth = pd.read_csv(self.opt.net_file, header=0)
File "/home/ekta/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 686, in read_csv
return _read(filepath_or_buffer, kwds)
File "/home/ekta/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 452, in _read
parser = TextFileReader(fp_or_buf, kwds)
File "/home/ekta/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 936, in __init__
self._make_engine(self.engine)
File "/home/ekta/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1168, in _make_engine
self._engine = CParserWrapper(self.f, self.options)
File "/home/ekta/anaconda3/lib/python3.8/site-packages/pandas/io/parsers.py", line 1998, in __init__
self._reader = parsers.TextReader(src, **kwds)
File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
File "pandas/_libs/parsers.pyx", line 674, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] No such file or directory: ''
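The empty path in the last line, together with the pd.read_csv(self.opt.net_file, ...) call higher up, suggests that net_file was left unset while the "new" setting expects a ground-truth network file. A small check like the one below, run before training starts, would give a clearer message. It is only a sketch: the function is hypothetical, and the assumption that only the "new" setting needs net_file is taken from the reply in this thread, not from the code.

```python
import os


def check_net_file(setting, net_file):
    """Return an error message if the ground-truth file is unusable, else None."""
    # Assumption from this thread: only the "new" setting reads net_file.
    if setting != "new":
        return None
    if not net_file:
        return "setting 'new' needs a ground-truth network file, but net_file is empty"
    if not os.path.isfile(net_file):
        return f"net_file not found: {net_file}"
    return None


print(check_net_file("new", ""))    # reports the empty path
print(check_net_file("test", ""))   # None: no ground-truth file expected
```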