ma-compbio / SNIPER

nuclear compartments, subcompartments, nuclear organization, Hi-C, autoencoder
MIT License

Questions about pre-computed SNIPER models #17

Open QianzhaoJ opened 3 years ago

QianzhaoJ commented 3 years ago

Hi, thanks for developing such a wonderful tool. I ran into some problems when applying the SNIPER models to my Hi-C data. After downloading the pre-computed SNIPER models, I found three kinds of files: autoencoder, encoder, and classifier. I wasn't sure whether to use the autoencoder or the encoder, so I tried both. When I used the autoencoder file, I got the following error:

ValueError: Error when checking input: expected dense_17_input to have shape (128,) but got array with shape (13393,)

When I used the `encoder` file, I got the following warning:

/dellstorage02/quj_lab/jiqianzhao/04_software/anaconda3/envs/SNIPER/lib/python3.6/site-packages/keras/engine/saving.py:292: UserWarning: No training configuration found in save file: the model was not compiled. Compile it manually

The output BED files also look very strange: every region is labeled as the "B3" subcompartment across the whole genome.

Here is the script I ran:

odd_encoder=$model/odd_encoder.h5
odd_clf=$model/odd_classifier.h5
even_encoder=$model/even_encoder.h5
even_clf=$model/even_classifier.h5
python $sniper $hic $mat $odd_encoder $odd_clf $even_encoder $even_clf -jt $juicer -dd $tar/${sample}

I don't know how to deal with this. Could you give me some suggestions?

Thanks in advance! Qianzhao

kairukuma commented 3 years ago

Hi, your script inputs are correct. I suspect the ValueError is because the autoencoder outputs a high-dimensional vector and the classifier expects a low-dimensional input.
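To make the shape mismatch concrete, here is a minimal sketch that tracks only the input and output widths of each saved model. It does not load the real `.h5` files or use Keras; `LayerStack` and `check_chain` are hypothetical helpers, the widths 13393 and 128 are taken from the ValueError above, and the 5 output classes are an assumption.

```python
from dataclasses import dataclass


@dataclass
class LayerStack:
    """Toy stand-in for a saved model: only tracks input/output widths."""
    name: str
    input_dim: int
    output_dim: int


def check_chain(first, second):
    """Raise ValueError if `second` cannot consume the output of `first`."""
    if first.output_dim != second.input_dim:
        raise ValueError(
            f"{second.name} expects shape ({second.input_dim},) "
            f"but {first.name} produces shape ({first.output_dim},)"
        )
    return True


N_FEATURES = 13393  # input width reported in the ValueError
LATENT_DIM = 128    # width the classifier expects, per the same error

# The encoder compresses the input; the autoencoder reconstructs it at full width.
encoder = LayerStack("encoder", N_FEATURES, LATENT_DIM)
autoencoder = LayerStack("autoencoder", N_FEATURES, N_FEATURES)
classifier = LayerStack("classifier", LATENT_DIM, 5)  # 5 classes is an assumption

check_chain(encoder, classifier)  # compatible: 128 -> 128

try:
    check_chain(autoencoder, classifier)  # incompatible: 13393 -> 128
except ValueError as err:
    print(err)
```

Under these assumptions, only the encoder/classifier pairing lines up, which matches the arguments used in the script above.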

The all-B3 outputs you're seeing are likely because either:

  1. The pre-computed model you selected expects much higher coverage than the Hi-C file you provided.
  2. Your inter-chromosomal contact map is almost completely devoid of signal. There's a good chance SNIPER will predict B3 across the board because B3 signals are generally associated with a lack of inter-chromosomal signal. Of the cell types we've tested, the inter-chromosomal signals of non-B3 subcompartments were sparse but still distinguishable.
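One rough way to check the second possibility is to measure how sparse the inter-chromosomal matrix is before running the model. The sketch below is only a diagnostic under stated assumptions: it is not part of SNIPER, the helper names are hypothetical, the matrix is assumed to be a list of rows (one row per genomic bin), and no particular cutoff for "too sparse" is implied.

```python
def interchromosomal_density(matrix):
    """Fraction of entries in a contact matrix that are non-zero."""
    total = sum(len(row) for row in matrix)
    if total == 0:
        raise ValueError("empty matrix")
    nonzero = sum(1 for row in matrix for value in row if value != 0)
    return nonzero / total


def empty_row_fraction(matrix):
    """Fraction of rows (bins) with no inter-chromosomal contacts at all."""
    if not matrix:
        raise ValueError("empty matrix")
    empty = sum(1 for row in matrix if not any(row))
    return empty / len(matrix)


# Toy 3-bin by 4-bin matrix of contact counts (made-up values).
toy = [
    [0, 3, 0, 1],
    [0, 0, 0, 0],
    [2, 0, 0, 0],
]

density = interchromosomal_density(toy)  # 3 non-zero entries out of 12
empty_rows = empty_row_fraction(toy)     # 1 empty row out of 3
print(f"density={density:.2f}, empty rows={empty_rows:.2f}")
```

Comparing these numbers between your matrix and one that gives sensible predictions could indicate whether missing signal is the likely cause.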

QianzhaoJ commented 3 years ago

Hi, thanks for your kind reply; it was very helpful. My Hi-C data contains about 100 million inter-chromosomal contacts, so I chose the pre-computed models from downsampled data (10%), and now I get normal output. I have a question about the downsampling percentage, though: if I select a pre-computed model from lower-coverage data than my Hi-C data, would SNIPER predict more incorrect non-B3 subcompartments, i.e. more false positives? I look forward to your reply, and thanks in advance!

Best, Qianzhao