nanoporetech / ont_fast5_api

Oxford Nanopore Technologies fast5 API software

sequencing_summary.txt for fast5_subset output #35

Closed mmiladi closed 4 years ago

mmiladi commented 4 years ago

Hi,

The fast5_subset program does not reproduce the sequencing_summary.txt for its output fast5 files. Is there a way to derive a summary file for the subset from the original summary file? Alternatively, is there a way to generate a summary file from the fast5 files without basecalling again?

Also, it would be great if the program supported this feature.

Best, -Milad

fbrennen commented 4 years ago

Hi @mmiladi -- fast5_subset produces a file called filename_mapping.txt that contains all the read_ids in the subset. You should be able to use that file to extract the relevant reads from your sequencing summary file very easily. Does that make sense?

mmiladi commented 4 years ago

Thanks for your reply @fbrennen. I would like to use the sequencing_summary.txt as input for downstream tools (specifically nanopolish). So do you mean I can intersect the read IDs in filename_mapping.txt with the original sequencing_summary.txt to produce a new valid summary file?

fbrennen commented 4 years ago

Hi @mmiladi -- yes, you should be able to do exactly that.
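
For anyone landing here later, a minimal sketch of that intersection in Python. It rests on assumptions that are not confirmed in this thread: that both filename_mapping.txt and sequencing_summary.txt are tab-separated with a header row, and that both have a column named `read_id`. The function names `read_subset_ids` and `filter_summary` and the output filename are made up for illustration and are not part of ont_fast5_api. Check the headers of your own files before relying on it.

```python
import csv


def read_subset_ids(mapping_path, id_column="read_id"):
    """Collect the read IDs listed in a tab-separated mapping file.

    Assumption: the file has a header row containing `id_column`.
    """
    with open(mapping_path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter="\t")
        if id_column not in (reader.fieldnames or []):
            raise ValueError(f"no {id_column!r} column in {mapping_path}")
        return {row[id_column] for row in reader}


def filter_summary(summary_path, read_ids, output_path, id_column="read_id"):
    """Write the summary header plus only the rows whose read ID is in read_ids.

    Returns the number of rows kept. Column order is preserved.
    """
    kept = 0
    with open(summary_path, newline="") as src, \
            open(output_path, "w", newline="") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        if id_column not in (reader.fieldnames or []):
            raise ValueError(f"no {id_column!r} column in {summary_path}")
        writer = csv.DictWriter(
            dst,
            fieldnames=reader.fieldnames,
            delimiter="\t",
            lineterminator="\n",
            extrasaction="ignore",  # tolerate rows with stray extra fields
        )
        writer.writeheader()
        for row in reader:
            if row[id_column] in read_ids:
                writer.writerow(row)
                kept += 1
    return kept


if __name__ == "__main__":
    # Hypothetical paths; adjust to your own run directory.
    ids = read_subset_ids("filename_mapping.txt")
    n = filter_summary("sequencing_summary.txt", ids, "sequencing_summary.subset.txt")
    print(f"kept {n} of {len(ids)} subset reads")
```

If the kept count is lower than the number of IDs in the mapping file, some subset reads are missing from the original summary, which is worth checking before passing the new file to downstream tools.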

mmiladi commented 4 years ago

Thanks for your assistance @fbrennen