OSU-BMBL / DESSO

A DL-based framework for sequence and shape motif identification in the human genome
https://desso.osubmi.org/

Predict DNA motifs from example fox01_peaks #5

Open kanglizhu opened 2 years ago

kanglizhu commented 2 years ago

Hello,

I want to predict TF binding sites using your example peak file. The reference genome is hg38, and the BED file is fox01_peaks.bed. After processing the peak file with the command "python processing_peaks.py --name fox01_peaks.bed --peak_flank 50", I started to train the model, but I got the error message below.

python train.py --start_index 0 --end_index 1 --peak_flank 50 --network CNN --feature_format Seq
Sequence length: 101
Dataset: fox01_peaks_encode
Traceback (most recent call last):
  File "train.py", line 163, in <module>
    tf.app.run()
  File "/home/zhenyingLab/zhukangli/.conda/envs/desso/lib/python2.7/site-packages/tensorflow/python/platform/app.py", line 48, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "train.py", line 68, in main
    util.load_data_encode(PATH_ENCODE + "/" + train_data_name + "_AC.seq.gz", peak_coor, train_data_name, path_curr_data, "train", DNA_SHAPE, peak_flank, back_grou)
  File "/storage/zhenyingLab/zhukangli/software/DESSO-master/data/libs/util.py", line 138, in load_data_encode
    sequences = np.concatenate((sequences, seq_shuffle), axis = 0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly

Any help or guidance would be greatly appreciated!

Thank you!

kangli zhu

viyjy commented 2 years ago

Hi Kangli,

Thanks for your interest in our paper. Can you provide more details about your data? For example, what does the processed data look like?

Thanks.

kanglizhu commented 2 years ago

processed_peaks.zip

Hi,

Thank you for your reply. I have uploaded my processed data.

kangli

viyjy commented 2 years ago

Got it. Thanks.

viyjy commented 2 years ago

@Wang-Cankun Cankun, do you have any suggestions? Did you add this file processing_peaks.py? Thanks.
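
For reference, the ValueError in the traceback above is what np.concatenate raises when two arrays differ in any dimension other than the concatenation axis. The thread does not establish the actual cause, so the following is only a minimal diagnostic sketch under one assumption: that the peak sequences and the shuffled background sequences ended up with different lengths. The helper names (one_hot, length_histogram, can_concatenate) and the toy sequences are hypothetical and are not part of DESSO.

```python
import numpy as np


def one_hot(seq):
    """One-hot encode a DNA string into an array of shape (len(seq), 4)."""
    index = {"A": 0, "C": 1, "G": 2, "T": 3}
    encoded = np.zeros((len(seq), 4), dtype=np.float32)
    for i, base in enumerate(seq.upper()):
        if base in index:  # N and other symbols stay all-zero
            encoded[i, index[base]] = 1.0
    return encoded


def length_histogram(seqs):
    """Count how many sequences have each length."""
    counts = {}
    for s in seqs:
        counts[len(s)] = counts.get(len(s), 0) + 1
    return counts


def can_concatenate(a, b, axis=0):
    """Return True if a and b agree on every dimension except `axis`."""
    if a.ndim != b.ndim:
        return False
    return all(
        da == db
        for i, (da, db) in enumerate(zip(a.shape, b.shape))
        if i != axis
    )


# Hypothetical toy data: the "shuffled" set is one base shorter.
peaks = ["ACGTA", "TTGCA"]
shuffled = ["ACGT", "TGCA"]

a = np.stack([one_hot(s) for s in peaks])     # shape (2, 5, 4)
b = np.stack([one_hot(s) for s in shuffled])  # shape (2, 4, 4)

print(length_histogram(peaks + shuffled))
print(a.shape, b.shape, can_concatenate(a, b))

try:
    np.concatenate((a, b), axis=0)
    failed = False
except ValueError as err:
    # Same class of error as in the traceback; the wording varies by NumPy version.
    failed = True
    print("ValueError:", err)
```

Printing the shapes of the two arrays just before the failing np.concatenate call, or building a length histogram of the processed sequences, would show whether a length mismatch is really the problem here.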