WGLab / DeepMod

DeepMod: a deep-learning tool for genomic-scale, strand-sensitive and single-nucleotide based detection of DNA modifications

No Fastq data in fast5 #32

Open laguda-1996 opened 4 years ago

laguda-1996 commented 4 years ago

Hello, I have the same problem when I use DeepMod. This is the structure of my fast5 file, as shown by h5ls -r:

    / Group
    /Analyses Group
    /Analyses/Basecall_1D_000 Group
    /Analyses/Basecall_1D_000/BaseCalled_template Group
    /Analyses/Basecall_1D_000/BaseCalled_template/Events Dataset {1561}
    /Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
    /Analyses/Basecall_1D_000/Configuration Group
    /Analyses/Basecall_1D_000/Configuration/basecall_1d Group
    /Analyses/Basecall_1D_000/Summary Group
    /Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
    /Analyses/Calibration_Strand_Detection_000 Group
    /Analyses/Calibration_Strand_Detection_000/Configuration Group
    /Analyses/Calibration_Strand_Detection_000/Configuration/calib_detector Group
    /Analyses/Calibration_Strand_Detection_000/Summary Group
    /Analyses/Calibration_Strand_Detection_000/Summary/calibration_strand_template Group
    /Analyses/Segmentation_000 Group
    /Analyses/Segmentation_000/Configuration Group
    /Analyses/Segmentation_000/Configuration/stall_removal Group
    /Analyses/Segmentation_000/Summary Group
    /Analyses/Segmentation_000/Summary/segmentation Group
    /PreviousReadInfo Group
    /Raw Group
    /Raw/Reads Group
    /Raw/Reads/Read_6 Group
    /Raw/Reads/Read_6/Signal Dataset {23415/Inf}
    /UniqueGlobalKey Group
    /UniqueGlobalKey/channel_id Group
    /UniqueGlobalKey/context_tags Group
    /UniqueGlobalKey/tracking_id Group

What should I do? I look forward to your reply.

laguda-1996 commented 4 years ago

Terminal output:

Nanopore sequencing data analysis is resourece-intensive and time consuming. Some potential strong recommendations are below: If your reference genome is large as human genome and your Nanopore data is huge, It would be faster to run this program parallelly to speed up. You might run different input folders of your fast5 files and give different output names (--FileID) or folders (--outFolder) A good way for this is to run different chromosome individually.

         Current directory: /home/dsy/software/DeepMod
                  outLevel: 2
                   wrkBase: /home/dsy/2T_WD/深度学习_识别m6a/数据/m6a/RNAAB090763.fast5.tar.gz/fast5/1204665-1.fast5
                    FileID: m6a
                 outFolder: 123/
                 recursive: 1
          files_per_thread: 1000
                   threads: 6
                windowsize: 21
                  alignStr: minimap2
               SignalGroup: simple
                      move: False
               basecall_1d: Basecall_1D_000
          basecall_2strand: BaseCalled_template
                    ConUnk: True
               outputlayer: 
                      Base: A
               mod_cluster: 0
                   predDet: 1
                       Ref: /home/dsy/2T_WD/深度学习_识别m6a/数据/reference_fasta/jiaomu.fna
                      fnum: 7
                    hidden: 100
                   modfile: train_mod/rnn_conmodA_P100wd21_f7ne1u0_4/mod_train_conmodA_P100wd21_f3ne1u0
                    region: [[None, None, None]]

/home/dsy/anaconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/dsy/anaconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/dsy/anaconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/dsy/anaconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/dsy/anaconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:521: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/dsy/anaconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
WARNING:tensorflow:From /home/dsy/anaconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/contrib/learn/python/learn/datasets/base.py:198: retry (from tensorflow.contrib.learn.python.learn.datasets.base) is deprecated and will be removed in a future version.
Instructions for updating:
Use the retry module or similar alternatives.
Total files=7999
2020-04-21 22:26:13.699284: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-21 22:26:13.924880: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-21 22:26:14.099183: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-21 22:26:14.369383: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-21 22:26:14.648059: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
2020-04-21 22:26:15.275578: I tensorflow/core/platform/cpu_feature_guard.cc:140] Your CPU supports instructions that this TensorFlow binary was not compiled to use: SSE4.1 SSE4.2 AVX AVX2 FMA
Error!!! No Fastq data in /home/dsy/2T_WD/深度学习_识别m6a/数据/m6a/RNAAB090763.fast5.tar.gz/fast5/1204665-1.fast5/GXB01170_20180726_FAH84534_GA20000_sequencing_run_RNAAB090763_92733_read_396_ch_503_strand.fast5
/home/dsy/software/DeepMod/bin/scripts/myDetect.py:157: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  events_data = sp_param['f5reader'][event_path].value
Error!!! No Fastq data in /home/dsy/2T_WD/深度学习_识别m6a/数据/m6a/RNAAB090763.fast5.tar.gz/fast5/1204665-1.fast5/GXB01170_20180726_FAH84534_GA20000_mux_scan_RNAAB090763_45082_read_10_ch_76_strand.fast5
/home/dsy/software/DeepMod/bin/scripts/myDetect.py:157: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  events_data = sp_param['f5reader'][event_path].value
Error!!! No Fastq data in /home/dsy/2T_WD/深度学习_识别m6a/数据/m6a/RNAAB090763.fast5.tar.gz/fast5/1204665-1.fast5/GXB01170_20180726_FAH84534_GA20000_mux_scan_RNAAB090763_45082_read_12_ch_331_strand.fast5
/home/dsy/software/DeepMod/bin/scripts/myDetect.py:157: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  events_data = sp_param['f5reader'][event_path].value
/home/dsy/software/DeepMod/bin/scripts/myDetect.py:157: H5pyDeprecationWarning: dataset.value has been deprecated. Use dataset[()] instead.
  events_data = sp_param['f5reader'][event_path].value
Error!!! No Fastq data in /home/dsy/2T_WD/深度学习_识别m6a/数据/m6a/RNAAB090763.fast5.tar.

liuqianhn commented 4 years ago

Hi @laguda-1996 , thanks for using DeepMod. When you use h5ls -r /home/dsy/2T_WD/深度学习_识别m6a/数据/m6a/RNAAB090763.fast5.tar.gz/fast5/1204665-1.fast5/GXB01170_20180726_FAH84534_GA20000_mux_scan_RNAAB090763_45082_read_12_ch_331_strand.fast5, can you find /Analyses/Basecall_1D_000/BaseCalled_template/Fastq?
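The same check can be scripted so that it is easy to repeat over many reads. The sketch below is not part of DeepMod; it assumes h5py is installed, and `fastq_basecall_groups` and `check_fast5` are hypothetical helper names. It lists every `Basecall_1D_*` group under `/Analyses` that contains a `BaseCalled_template/Fastq` dataset, so the result can be compared with the `basecall_1d: Basecall_1D_000` value printed in the terminal output.

```python
def fastq_basecall_groups(fast5_file, strand="BaseCalled_template"):
    """List the Basecall_1D_* groups under /Analyses that hold a Fastq dataset.

    `fast5_file` is an open h5py.File (any nested mapping with the same
    layout also works, which keeps the helper easy to test).
    """
    if "Analyses" not in fast5_file:
        return []
    analyses = fast5_file["Analyses"]
    found = []
    for name in analyses:  # iterating a group yields its member names
        if not name.startswith("Basecall_1D_"):
            continue
        group = analyses[name]
        if strand in group and "Fastq" in group[strand]:
            found.append(name)
    return sorted(found)


def check_fast5(path):
    """Open one fast5 file read-only and report its Fastq-bearing groups."""
    import h5py  # third-party; imported here so the helper above has no dependency

    with h5py.File(path, "r") as f5:
        return fastq_basecall_groups(f5)
```

An empty list means no base-called sequence is stored in that file. A list without `Basecall_1D_000` would mean the sequence sits under a different group name than the one in the settings above.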

laguda-1996 commented 4 years ago

Thank you for your reply. Yes, and I can extract the fastq sequence with Python.
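For readers who want to try the same extraction, here is a minimal sketch. It is not the script used in this thread and not DeepMod's own reader. It assumes h5py is installed and that the file follows the single-read layout listed earlier; `decode_fastq`, `read_fastq` and `read_fastq_from_path` are hypothetical names.

```python
FASTQ_PATH = "/Analyses/Basecall_1D_000/BaseCalled_template/Fastq"


def decode_fastq(raw):
    """Split the scalar Fastq value into (name, sequence, quality)."""
    # Depending on the h5py version and dtype, the value arrives as bytes or str.
    text = raw.decode("utf-8") if isinstance(raw, bytes) else str(raw)
    lines = text.strip().split("\n")
    if len(lines) != 4 or not lines[0].startswith("@") or not lines[2].startswith("+"):
        raise ValueError("not a single four-line FASTQ record")
    return lines[0][1:], lines[1], lines[3]


def read_fastq(fast5_file, fastq_path=FASTQ_PATH):
    """Return the decoded record, or None when the dataset is absent."""
    if fastq_path not in fast5_file:
        return None
    # dataset[()] reads a scalar dataset; it replaces the deprecated dataset.value
    return decode_fastq(fast5_file[fastq_path][()])


def read_fastq_from_path(path):
    """Open a fast5 file read-only and extract its FASTQ record."""
    import h5py  # third-party; imported here so the helpers above stay dependency-free

    with h5py.File(path, "r") as f5:
        return read_fastq(f5)
```

Calling `read_fastq_from_path` on one of the reads named in the error messages would show whether the record can be read from Python even though DeepMod reports it as missing.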

liuqianhn commented 4 years ago

@laguda-1996, if you can get the fastq sequences, I do not see any other potential issue here. Could you share the fast5 file with me so that I can test it directly before suggesting a solution?

laguda-1996 commented 4 years ago

This is one file from the folder: GXB01170_20180726_FAH84534_GA20000_sequencing_run_RNAAB090763_92733_read_998_ch_393_strand.zip

chenchen-eng commented 4 years ago

laguda-1996 wrote: "Thank you for your reply Yes, and I can extract the fastq sequence with Python"

Teacher, can you share some Python scripts? I have the same problem.