scvae / scvae

Deep learning for single-cell transcript counts
Apache License 2.0

Error while performing evaluation #14

Closed gprashant17 closed 4 years ago

gprashant17 commented 4 years ago

I used the following command to train the model:

$ scvae train "/content/drive/My Drive/Datasets/SRA779509_SRS3805247.json" --split-data-set --splitting-method random --splitting-fraction 0.8 -l 10 -H 200 100 -w 5 -e 5 -M "/content/drive/My Drive/Datasets/model" -r zero_inflated_negative_binomial -q unit_variance_gaussian

This created a folder in the specified directory that contained all the log output and checkpoint files and tfevents files for the training and validation sets.

However, when I ran the following command to evaluate the model,

$ scvae evaluate "/content/drive/My Drive/Datasets/SRA779509_SRS3805247.json" --analyses-directory "/content/drive/My Drive/Datasets/analyses" --prediction-method kmeans --decomposition-methods tSNE --split-data-set -l 10 -H 200 100 -w 5 --splitting-fraction 0.8

I got this exception (after the data set had been split):

...
Model
═════

tcmalloc: large alloc 1177214976 bytes == 0xb302a000 @  0x7f184e3f11e7 0x7f184bed75e1 0x7f184bf3bc78 0x7f184bf3edb8 0x7f184bf3f395 0x7f184bfd665d 0x50a635 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509a90 0x50a48d 0x50cd96 0x507d64 0x509a90 0x50a48d 0x50cd96 0x509758 0x50a48d 0x50bfb4 0x507d64 0x509a90 0x50a48d 0x50cd96
/usr/local/lib/python3.6/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.
  import pandas.util.testing as tm
Traceback (most recent call last):
  File "/usr/local/bin/scvae", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/scvae/cli.py", line 1225, in main
    status = arguments.func(**vars(arguments))
  File "/usr/local/lib/python3.6/dist-packages/scvae/cli.py", line 416, in evaluate
    raise Exception("Cannot analyse model when it has not been trained.")
Exception: Cannot analyse model when it has not been trained.

I would like to know how we can ensure that the model trained with the first command is the one used for evaluation. Do we have to specify the directory where the model was saved when running the evaluation?

Thanks!

chgroenbech commented 4 years ago

Yes, you also need to specify the model directory for the evaluate command using the -M option.

In the next version, I have included the model directory path in the error message, so it should be easier to find which options are missing. In doing that, I noticed that your evaluation command is also missing the -r and -q options that you used in your training command. So if you add the following to your evaluation command, you should be able to evaluate your model:

-M "/content/drive/My Drive/Datasets/model" -r zero_inflated_negative_binomial -q unit_variance_gaussian

Remember to specify the number of clusters using the -K option for the k-means clustering, if labels are not included in the data set.
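
Put together, the evaluation command might look like the sketch below. Every flag and value is copied from the commands earlier in this thread; the helper name evaluate_model and the DRY_RUN switch are hypothetical and only there so that the command can be inspected before it is run. The -K option is left out because the number of clusters depends on the data set.

```shell
# Paths copied from the commands in this thread.
DATA="/content/drive/My Drive/Datasets/SRA779509_SRS3805247.json"
MODEL_DIR="/content/drive/My Drive/Datasets/model"
ANALYSES_DIR="/content/drive/My Drive/Datasets/analyses"

# Hypothetical helper: builds the evaluate command with the same model
# options as the training command (-l, -H, -w, -M, -r, -q).
# If the data set has no labels, -K <number of clusters> would be added too.
evaluate_model() {
  set -- scvae evaluate "$DATA" \
    --split-data-set --splitting-fraction 0.8 \
    -l 10 -H 200 100 -w 5 \
    -M "$MODEL_DIR" \
    -r zero_inflated_negative_binomial \
    -q unit_variance_gaussian \
    --analyses-directory "$ANALYSES_DIR" \
    --prediction-method kmeans \
    --decomposition-methods tSNE

  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Dry run (default): print one argument per line instead of running.
    printf '%s\n' "$@"
  else
    "$@"
  fi
}

evaluate_model
```

By default this only prints the arguments; setting DRY_RUN=0 before calling evaluate_model would run scvae itself.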

In the future, I will make it easier to specify a trained model.

gprashant17 commented 4 years ago

Hi, thanks for your reply. I am using Google Colab to run the commands, and path names there are apparently case-sensitive. In my case, the folder SRA779509_SRS3805247 is accessed as sra779509_srs3805247/, which is why the error showed up. I renamed the folder accordingly and followed your suggestions. It is working now!
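
For anyone hitting the same thing, a case-insensitive search is a quick way to spot such a mismatch. This is only a sketch: the helper name find_ignoring_case is made up, it relies on the -iname test of find (available in GNU and BSD find), and it assumes the folder sits somewhere under the Datasets directory used above.

```shell
# Hypothetical helper: list entries under directory $1 whose name
# matches $2, ignoring upper/lower case.
find_ignoring_case() {
  find "$1" -iname "$2"
}

# Assumed search root; adjust to wherever the model folders were written.
SEARCH_ROOT="/content/drive/My Drive/Datasets"

# Only search if the directory exists on this machine.
if [ -d "$SEARCH_ROOT" ]; then
  find_ignoring_case "$SEARCH_ROOT" "SRA779509_SRS3805247"
fi
```

Any path printed with different capitalisation than expected points to the folder that needs renaming.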

wenyuhaokikika commented 1 year ago

Hello, are there any models that have been trained using GTEx, TCGA or XENA?