arq5x / poretools

a toolkit for working with Oxford nanopore data
MIT License

Local base calling & fast5 attribute/data format #103

Open noncodo opened 8 years ago

I think MinKNOW's local 1D basecalling (Albacore?) stores data in the fast5 files differently from Metrichor / nanonet basecalling. When I run many poretools commands, I get similar errors that seem to stem from different attribute naming in the fast5 files.

For example:

poretools 0.6.0

$ poretools yield_plot ./pass/
WARNING:poretools:No start time for ./pass/MinION_20161031_FNFAD24075_MN19348_sequencing_run_sample_57684_ch100_read2727_strand.fast5!
[same warning for all reads]

Traceback (most recent call last):
  File "/Users/noncodo/miniconda2/bin/poretools", line 9, in <module>
    load_entry_point('poretools==0.6.0', 'console_scripts', 'poretools')()
  File "/Users/noncodo/miniconda2/lib/python2.7/site-packages/poretools-0.6.0-py2.7.egg/poretools/poretools_main.py", line 533, in main
    args.func(parser, args)
  File "/Users/noncodo/miniconda2/lib/python2.7/site-packages/poretools-0.6.0-py2.7.egg/poretools/poretools_main.py", line 55, in run_subtool
    submodule.run(parser, args)
  File "/Users/noncodo/miniconda2/lib/python2.7/site-packages/poretools-0.6.0-py2.7.egg/poretools/yield_plot.py", line 75, in run
    start_time = fast5.get_start_time()
  File "/Users/noncodo/miniconda2/lib/python2.7/site-packages/poretools-0.6.0-py2.7.egg/poretools/Fast5File.py", line 541, in get_start_time
    node = self.find_event_timing_block()
  File "/Users/noncodo/miniconda2/lib/python2.7/site-packages/poretools-0.6.0-py2.7.egg/poretools/Fast5File.py", line 507, in find_event_timing_block
    path = fastq_paths[self.version]['template'] % (self.group)
KeyError: 'template'
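
The last frame of the traceback indexes a nested mapping with `fastq_paths[self.version]['template']`, so the `KeyError` suggests that the entry selected for this file's detected version has no `'template'` key. Below is a minimal sketch of that failure mode and a defensive lookup. The table contents, the version names `layout_a` and `layout_b`, and the helper `template_path` are all hypothetical; the real contents of poretools' `fastq_paths` are not shown in the traceback.

```python
# Hypothetical path table, for illustration only. The real poretools
# table is not visible in the traceback above.
FASTQ_PATHS = {
    "layout_a": {"template": "/Analyses/Basecall_2D_%03d/BaseCalled_template"},
    "layout_b": {},  # a version entry that defines no 'template' path
}


def template_path(version, group):
    """Return the template path for a version, or None if none is defined."""
    pattern = FASTQ_PATHS.get(version, {}).get("template")
    if pattern is None:
        return None
    return pattern % group


# Direct indexing fails the same way as the traceback does.
try:
    FASTQ_PATHS["layout_b"]["template"]
    direct_lookup_failed = False
except KeyError:
    direct_lookup_failed = True

print(direct_lookup_failed)
print(template_path("layout_a", 0))
print(template_path("layout_b", 0))
```

Returning `None` lets the caller skip the read or fall back to another source of timing information instead of aborting the whole run.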

$ h5ls -r ./pass/MinION_20161031_FNFAD24075_MN19348_sequencing_run_sample_57684_ch100_read2727_strand.fast5
/                        Group
/Analyses                Group
/Analyses/Basecall_1D_000 Group
/Analyses/Basecall_1D_000/BaseCalled_template Group
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/Analyses/Basecall_1D_000/Summary Group
/Analyses/Basecall_1D_000/Summary/basecall_1d_template Group
/Raw                     Group
/Raw/Reads               Group
/Raw/Reads/Read_968      Group
/Raw/Reads/Read_968/Signal Dataset {48979/Inf}
/UniqueGlobalKey         Group
/UniqueGlobalKey/channel_id Group
/UniqueGlobalKey/context_tags Group
/UniqueGlobalKey/tracking_id Group
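
In the listing above, `BaseCalled_template` contains only a `Fastq` dataset. If poretools expects an `Events` dataset there for read timing (an assumption based on the function name `find_event_timing_block`), that would fit the "No start time" warnings. A rough sketch for checking this from `h5ls -r` output follows. The helper `parse_h5ls` is hypothetical, the sample listing is a subset of the one above, and the parser assumes paths contain no spaces.

```python
def parse_h5ls(listing):
    """Map each HDF5 path in `h5ls -r` output to its object kind."""
    objects = {}
    for line in listing.splitlines():
        parts = line.split()
        # Expect lines such as "/Raw/Reads Group" or "/x/y Dataset {SCALAR}".
        if len(parts) >= 2 and parts[0].startswith("/"):
            objects[parts[0]] = parts[1]
    return objects


# A subset of the listing from this issue.
LISTING = """\
/Analyses/Basecall_1D_000/BaseCalled_template Group
/Analyses/Basecall_1D_000/BaseCalled_template/Fastq Dataset {SCALAR}
/Raw/Reads/Read_968/Signal Dataset {48979/Inf}
"""

objects = parse_h5ls(LISTING)
template = "/Analyses/Basecall_1D_000/BaseCalled_template"

# True only if an Events dataset sits next to Fastq.
has_events = (template + "/Events") in objects
datasets = sorted(path for path, kind in objects.items() if kind == "Dataset")

print(has_events)
print(datasets)
```

Running the same check on a Metrichor-basecalled file would show whether the two layouts differ at this point.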