Open 123chenshixin opened 3 years ago
@123chenshixin DeepMod expects something like the following: /Analyses/Basecall_1D_000/BaseCalled_template/Events or /Analyses/Basecall_1D_000/BaseCalled_template/Move, which your fast5 file does not have. You might need to re-basecall your data.
@liuqianhn My fast5 files were basecalled by the sequencing company with the Albacore software. Is it necessary to basecall with Guppy instead of Albacore?
@123chenshixin Albacore can generate Events, and I have used Albacore a lot. You might need to pass different options to tell Albacore to generate Events. By default, Events are not stored in fast5 files. In a fast5 file with Events, you will see something similar to the following:
/ Group
/Analyses Group
/Analyses/Basecall_1D_000 Group
/Analyses/Basecall_1D_000/BaseCalled_template Group
/Analyses/Basecall_1D_000/BaseCalled_template/Events Dataset {4555}
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
The 5th row is the Events dataset that DeepMod needs.
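One way to check a file before running DeepMod is to scan the text printed by `h5ls -r` for the two dataset paths named above. The following is only a minimal sketch under that assumption: the function name `find_deepmod_inputs` is hypothetical and not part of DeepMod, the `Basecall_1D_000` group name is hard-coded as it appears in this thread, and the code inspects listing text rather than opening the fast5 file.

```python
import re

# The two dataset paths named in this thread as DeepMod's expected input.
# The Basecall_1D_000 group name is hard-coded here (an assumption).
_DEEPMOD_INPUT = re.compile(
    r"^/Analyses/Basecall_1D_000/BaseCalled_template/(Events|Move)\s+Dataset\b"
)


def find_deepmod_inputs(h5ls_text):
    """Return the set of dataset names ('Events', 'Move') found in `h5ls -r` output."""
    found = set()
    for line in h5ls_text.splitlines():
        match = _DEEPMOD_INPUT.match(line.strip())
        if match:
            found.add(match.group(1))
    return found


# Listing copied from the example above.
listing = """\
/ Group
/Analyses Group
/Analyses/Basecall_1D_000 Group
/Analyses/Basecall_1D_000/BaseCalled_template Group
/Analyses/Basecall_1D_000/BaseCalled_template/Events Dataset {4555}
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
"""
print("found:", sorted(find_deepmod_inputs(listing)))
```

An empty result would correspond to the "No events data" or "No move data" situation discussed in this issue.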
Hi, Qian,
Thanks for your kind reply. I understand that we need to re-basecall to generate the "Events" dataset, but unfortunately we have been stuck at this step for a long time. I am wondering whether there is a way to skip this step and still process our data. That would be greatly appreciated by people like us, who received basecalled fast5 and fastq data directly from a sequencing company. We also tried Deepsignal, which works fine without "Events", but we think DeepMod has more advantages, so if there is anything that could substitute for "Events", we would like to try it. Thanks a lot!
@123chenshixin Thank you very much for your interest in our tool. I checked the h5ls output you provided. It seems that Deepsignal also uses Events, but it takes them from /Analyses/RawGenomeCorrected_001/BaseCalled_template/Events Dataset {29849}, which is generated by Tombo and provides signal segmentation similar to basecalling but in a different format. I do not have a clear idea of how Tombo works at the moment, and I am afraid that it will not be easy to incorporate this feature into our tool in a short time.
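To see which analysis group an Events dataset actually lives in (the basecaller's `Basecall_1D_000` or Tombo's `RawGenomeCorrected_001`), the `h5ls -r` text can be scanned for any `.../BaseCalled_template/Events` entry. This is a rough sketch, not something DeepMod or Deepsignal provides; the helper name `events_groups` is made up for illustration.

```python
import re

# Matches an Events dataset under any analysis group; an arbitrary
# path prefix before /Analyses is tolerated.
_EVENTS = re.compile(
    r"^\S*/Analyses/([^/\s]+)/BaseCalled_template/Events\s+Dataset\b"
)


def events_groups(h5ls_text):
    """Return the analysis group names that contain an Events dataset, in listing order."""
    groups = []
    for line in h5ls_text.splitlines():
        match = _EVENTS.match(line.strip())
        if match and match.group(1) not in groups:
            groups.append(match.group(1))
    return groups


# Lines taken from the Tombo-processed file discussed above.
tombo_listing = """\
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/Analyses/RawGenomeCorrected_001/BaseCalled_template/Events Dataset {29849}
"""
print(events_groups(tombo_listing))
```

If the only group reported is `RawGenomeCorrected_001`, the file carries Tombo's Events but not the basecaller Events that DeepMod looks for.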
Hi, Qian, thanks for your friendly reply, and thank you for taking your valuable time to try to solve my problem!
Hi, teacher
I'm trying to call DNA modifications using DeepMod. First, I successfully created a virtual environment and installed DeepMod step by step, following the command lines in your Installation Guide. Then I began to call DNA modifications.
/ Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Basecall_1D_000 Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Basecall_1D_000/BaseCalled_template Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Basecall_1D_000/Summary Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Segmentation_000 Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Segmentation_000/Summary Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Analyses/Segmentation_000/Summary/segmentation Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Raw Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/Raw/Signal Dataset {21089/Inf}
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/channel_id Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/context_tags Group
/read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/tracking_id Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0 Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Basecall_1D_000 Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Basecall_1D_000/BaseCalled_template Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Basecall_1D_000/Summary Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Segmentation_000 Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Segmentation_000/Summary Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Analyses/Segmentation_000/Summary/segmentation Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Raw Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/Raw/Signal Dataset {8036/Inf}
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/channel_id Group
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/context_tags Group, same as /read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/context_tags
/read_002a0d4c-d81c-400e-aa3d-56f980a361d0/tracking_id Group, same as /read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/tracking_id
/read_003f4b93-0399-434a-8643-da1dae43ab48 Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Basecall_1D_000 Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Basecall_1D_000/BaseCalled_template Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Basecall_1D_000/Summary Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Segmentation_000 Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Segmentation_000/Summary Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Analyses/Segmentation_000/Summary/segmentation Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Raw Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/Raw/Signal Dataset {128721/Inf}
/read_003f4b93-0399-434a-8643-da1dae43ab48/channel_id Group
/read_003f4b93-0399-434a-8643-da1dae43ab48/context_tags Group, same as /read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/context_tags
/read_003f4b93-0399-434a-8643-da1dae43ab48/tracking_id Group, same as /read_0004d61f-0c92-4b10-ba33-5a5dac260f3a/tracking_id
/read_0046fafb-384b-4fd4-81a6-c778c9d5e6cd Group
/read_0046fafb-384b-4fd4-81a6-c778c9d5e6cd/Analyses Group
/read_0046fafb-384b-4fd4-81a6-c778c9d5e6cd/Analyses/Basecall_1D_000 Group
/read_0046fafb-384b-4fd4-81a6-c778c9d5e6cd/Analyses/Basecall_1D_000/BaseCalled_template Group
/ Group
/Analyses Group
/Analyses/Basecall_1D_000 Group
/Analyses/Basecall_1D_000/BaseCalled_template Group
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/Analyses/Basecall_1D_000/Summary Group
/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/Analyses/RawGenomeCorrected_001 Group
/Analyses/RawGenomeCorrected_001/BaseCalled_template Group
/Analyses/RawGenomeCorrected_001/BaseCalled_template/Alignment Group
/Analyses/RawGenomeCorrected_001/BaseCalled_template/Events Dataset {29849}
/Analyses/Segmentation_000 Group
/Analyses/Segmentation_000/Summary Group
/Analyses/Segmentation_000/Summary/segmentation Group
/Raw Group
/Raw/Reads Group
/Raw/Reads/Read_36017 Group
/Raw/Reads/Read_36017/Signal Dataset {365072/Inf}
/UniqueGlobalKey Group
/UniqueGlobalKey/channel_id Group
/UniqueGlobalKey/context_tags Group
/UniqueGlobalKey/tracking_id Group
The results are as follows.
Nanopore sequencing data analysis is resourece-intensive and time consuming.
Some potential strong recommendations are below:
If your reference genome is large as human genome and your Nanopore data is huge,
It would be faster to run this program parallelly to speed up.
You might run different input folders of your fast5 files and give different output names (--FileID) or folders (--outFolder)
A good way for this is to run different chromosome individually.
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:516: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:517: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:518: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:519: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:520: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:525: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:541: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint8 = np.dtype([("qint8", np.int8, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:542: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:543: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:544: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:545: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  _np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorboard/compat/tensorflow_stub/dtypes.py:550: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
  np_resource = np.dtype([("resource", np.ubyte, 1)])
Total files=94416
WARNING:tensorflow:From /home/cxs3_z4/software/DeepMod/bin/DeepMod_scripts/myMultiBiRNN.py:30: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /home/cxs3_z4/software/DeepMod/bin/DeepMod_scripts/myMultiBiRNN.py:30: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.
WARNING:tensorflow:From /home/cxs3_z4/software/DeepMod/bin/DeepMod_scripts/myMultiBiRNN.py:47: static_bidirectional_rnn (from tensorflow.python.ops.rnn) is deprecated and will be removed in a future version.
Instructions for updating:
Please use keras.layers.Bidirectional(keras.layers.RNN(cell, unroll=True)), which is equivalent to this API
WARNING:tensorflow:Entity <bound method MultiRNNCell.call of <tensorflow.python.ops.rnn_cell_impl.MultiRNNCell object at 0x7fed9f108c50>> could not be transformed and will be executed as-is. Please report this to the AutgoGraph team. When filing the bug, set the verbosity to 10 (on Linux, export AUTOGRAPH_VERBOSITY=10) and attach the full output. Cause: converting <bound method MultiRNNCell.call of <tensorflow.python.ops.rnn_cell_impl.MultiRNNCell object at 0x7fed9f108c50>>: AttributeError: module 'gast' has no attribute 'Index'
2021-02-22 21:13:31.726078: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
WARNING:tensorflow:The saved meta_graph is possibly from an older release: 'metric_variables' collection should be of type 'byte_list', but instead is of type 'node_list'.
WARNING:tensorflow:From /home/cxs3_z4/software/miniconda3/envs/mdeepmod/lib/python3.6/site-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/e617a13d-211d-4cac-a14f-c012fa49577f.fast5
Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/f0d90ef8-da77-4672-9798-f2ed6a46860a.fast5
Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/6537dde7-e8e2-4fb3-ac40-95897a2eeabb.fast5
Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/87ff5443-034a-42c6-8f59-a85134648d6c.fast5
Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/e8382d8f-7e94-4090-a8b9-10b2ee7cee72.fast5
2021-02-22 21:13:31.970272: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/a86bbfe4-9c90-46cb-a62b-4f2b60394b46.fast5
2021-02-22 21:13:31.991899: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/a8a9d5ce-88fb-4f32-9960-ea0182cd1fd4.fast5
.
.
.
[M::mm_idx_gen::0.378*1.07] collected minimizers
[M::mm_idx_gen::0.509*1.56] sorted minimizers
[M::main::0.509*1.56] loaded/built the index for 5 target sequence(s)
[M::mm_mapopt_update::0.543*1.53] mid_occ = 23
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 5
[M::mm_idx_stat::0.569*1.50] distinct minimizers: 1483561 (90.31% are singletons); average occurrences: 1.171; average spacing: 5.361
[M::main] Version: 2.17-r941
[M::main] CMD: minimap2 -ax map-ont /home/cxs3_z4/ds/deepmod/D18395/genome.nextpolish.fasta /tmp/tmpqziwwa6g.fa
[M::main] Real time: 0.585 sec; CPU: 0.871 sec; Peak RSS: 0.444 GB
Cur Prediction consuming time 29 for 0 93
Error information for different fast5 files:
No events data 93416
Per-read Prediction consuming time 372
Find: ./D18395_deepmod//D18395 0 rnn.pred.ind []
Genomic-position Detection consuming time 0
real 6m17.097s
user 23m9.292s
sys 2m8.143s
I also tried the --move parameter, but it doesn't work. One of the results is as follows:
Error!!! No move data in /home/cxs3_z4/ds/deepsignal/10/5f095916-4683-4e3b-9b5a-59feaac0c9eb.fast5
I don't know what is going wrong.
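With tens of thousands of files, it can help to summarize the "Error!!! No events data" and "Error!!! No move data" lines instead of reading the log by eye. The snippet below is a small sketch that assumes the log has been saved as text and that the error lines look exactly like the ones quoted in this issue; `summarize_missing` is an illustrative helper, not a DeepMod function.

```python
import re
from collections import Counter

# Error line format as it appears in the log quoted in this issue.
_ERROR = re.compile(r"Error!!! No (events|move) data in (\S+\.fast5)")


def summarize_missing(log_text):
    """Count 'No events data' / 'No move data' errors and collect the affected files."""
    counts = Counter()
    files = {}
    for kind, path in _ERROR.findall(log_text):
        counts[kind] += 1
        files.setdefault(kind, []).append(path)
    return counts, files


# Two error lines copied from the logs above.
log = (
    "Error!!! No events data in /home/cxs3_z4/ds/deepsignal/8/e617a13d-211d-4cac-a14f-c012fa49577f.fast5\n"
    "Error!!! No move data in /home/cxs3_z4/ds/deepsignal/10/5f095916-4683-4e3b-9b5a-59feaac0c9eb.fast5\n"
)
counts, files = summarize_missing(log)
for kind in sorted(counts):
    print(kind, counts[kind], files[kind][0])
```

Because the pattern is searched anywhere in the text, it also works when several messages end up on one line, as can happen when a log is pasted into an issue.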