kimmo1019 / scDEC

Simultaneous deep generative modeling and clustering of single cell genomic data
MIT License

Question about forebrain dataset reproduction #1

Open Linshiqi-Git opened 2 years ago

Linshiqi-Git commented 2 years ago

Hi, scDEC is a very interesting and useful tool, but I ran into some confusion when reproducing the Forebrain results. My parameters:

  • data = 'Forebrain'
  • model = importlib.import_module("model")
  • nb_classes = 8
  • x_dim = 7
  • y_dim = 20
  • batch_size = 64
  • nb_batches = 50000
  • alpha = 10.0
  • beta = 10.0
  • ratio = 0.2
  • low = 0.03
  • timestamp = ''
  • is_train = False
  • has_label = True
  • mode = 1

Outputs:

Loading Pre-trained Model...
INFO:tensorflow:Restoring parameters from ./scDEC/pre_trained_models/Forerain/model.ckpt-best
scDEC: NMI = 0.46680481742184277, ARI = 0.3193782287094836, Homogeneity = 0.46465900691199974

I then ran "python eval.py --data Forebrain --timestamp 20211222_105222 --train True" to get the clustering results, but the t-SNE plot shows that the three subtypes of excitatory neurons (EX1, EX2 and EX3) are not as clearly separated as in the article. Could you suggest why I can't reproduce the results in your paper? Did I use the wrong parameters? If so, which parameters did you use? I would appreciate any help. Thank you very much! @kimmo1019
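For readers comparing the three reported scores against the paper, the following is a minimal plain-Python sketch of the standard definitions of NMI, ARI and homogeneity. It is not scDEC's evaluation code, which is not shown in this thread. The function name `clustering_scores` is hypothetical, and the arithmetic-mean normalization for NMI is an assumption. scikit-learn offers equivalent functions (`normalized_mutual_info_score`, `adjusted_rand_score`, `homogeneity_score`).

```python
from collections import Counter
from math import log


def _comb2(n):
    """Number of unordered pairs that can be drawn from n items."""
    return n * (n - 1) / 2.0


def _entropy(counts, n):
    """Shannon entropy (natural log) of a partition given its cluster sizes."""
    return -sum((c / n) * log(c / n) for c in counts if c > 0)


def clustering_scores(true_labels, pred_labels):
    """Hypothetical helper: return NMI, ARI and homogeneity for two labelings."""
    if len(true_labels) != len(pred_labels) or len(true_labels) < 2:
        raise ValueError("need two labelings of equal length >= 2")
    n = float(len(true_labels))
    pairs = Counter(zip(true_labels, pred_labels))  # contingency table
    rows = Counter(true_labels)                     # true class sizes
    cols = Counter(pred_labels)                     # predicted cluster sizes

    # Mutual information between the two labelings.
    mi = sum((nij / n) * log(n * nij / (rows[t] * cols[p]))
             for (t, p), nij in pairs.items())
    h_true = _entropy(rows.values(), n)
    h_pred = _entropy(cols.values(), n)

    # NMI normalized by the arithmetic mean of the entropies (assumption).
    mean_h = (h_true + h_pred) / 2.0
    nmi = mi / mean_h if mean_h > 0 else 1.0

    # Homogeneity: 1 - H(true | pred) / H(true), which equals MI / H(true).
    homogeneity = mi / h_true if h_true > 0 else 1.0

    # Adjusted Rand index from pair counts.
    sum_pairs = sum(_comb2(c) for c in pairs.values())
    sum_rows = sum(_comb2(c) for c in rows.values())
    sum_cols = sum(_comb2(c) for c in cols.values())
    expected = sum_rows * sum_cols / _comb2(n)
    max_index = (sum_rows + sum_cols) / 2.0
    ari = ((sum_pairs - expected) / (max_index - expected)
           if max_index != expected else 1.0)

    return {"NMI": nmi, "ARI": ari, "Homogeneity": homogeneity}
```

All three scores are invariant to how cluster IDs are numbered, so a perfect clustering scores 1.0 even if its labels are permuted relative to the reference annotation.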

kimmo1019 commented 2 years ago

Sorry for the late reply; I just returned from holiday. Could you tell me your environment settings, such as your Python and TensorFlow versions? Thanks!
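One way to gather the requested environment details is sketched below. The helper name `report_versions` is hypothetical and not part of scDEC. It reads each module's `__version__` attribute, which most scientific packages expose but which is not guaranteed for every package.

```python
import importlib
import platform


def report_versions(module_names):
    """Hypothetical helper: map 'python' and each module name to a version string."""
    versions = {"python": platform.python_version()}
    for name in module_names:
        try:
            module = importlib.import_module(name)
        except Exception:
            # Covers a missing package as well as an install that fails on import.
            versions[name] = "unavailable"
            continue
        # Not every module defines __version__, so fall back to a placeholder.
        versions[name] = str(getattr(module, "__version__", "unknown"))
    return versions
```

Calling `report_versions(["tensorflow", "numpy", "sklearn", "scipy"])` and printing the resulting dictionary gives a compact summary that can be pasted into a reply.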

Linshiqi-Git commented 2 years ago

My environment:

  • Python 3.7.10
  • tensorboard 1.15.0
  • tensorflow 1.15.4
  • tensorflow-estimator 1.15.1
  • scikit-learn 1.0.1
  • pandas 1.3.4
  • scipy 1.7.1
  • sklearn 0.0
  • scanpy 1.8.2
  • anndata 0.7.6
  • keras 2.3.1
  • matplotlib 3.4.3
  • numpy 1.19.5
  • torch 1.10.0

Thanks!

kimmo1019 commented 2 years ago

Please check the dependencies listed for scDEC. scDEC was developed with Python 2.7 and TensorFlow 1.13.1; could you try again with those versions? I recommend using a virtual environment for testing scDEC.
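As an illustration of the mismatch described above, the sketch below compares the versions reported in this thread with the two versions named in the reply. The names `version_tuple`, `find_mismatches`, `EXPECTED` and `REPORTED` are hypothetical, and matching on a version prefix (so that 2.7.18 satisfies 2.7) is an assumption. The full scDEC dependency list is not reproduced here.

```python
def version_tuple(text):
    """Turn '1.13.1' into (1, 13, 1); non-numeric parts are skipped."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def find_mismatches(installed, expected):
    """Return (name, expected, installed) for every package that does not match."""
    mismatches = []
    for name, want in sorted(expected.items()):
        have = installed.get(name)
        if have is None:
            mismatches.append((name, want, "missing"))
            continue
        want_t = version_tuple(want)
        # Prefix match: an expected '2.7' accepts any installed 2.7.x (assumption).
        if version_tuple(have)[:len(want_t)] != want_t:
            mismatches.append((name, want, have))
    return mismatches


# Versions named in the maintainer's reply.
EXPECTED = {"python": "2.7", "tensorflow": "1.13.1"}
# Versions reported earlier in this thread.
REPORTED = {"python": "3.7.10", "tensorflow": "1.15.4"}

for name, want, have in find_mismatches(REPORTED, EXPECTED):
    print("%s: expected %s, found %s" % (name, want, have))
```

With the values from this thread, both Python and TensorFlow are flagged, which is consistent with the suggestion to retry in a separate virtual environment.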