Closed MarioRinBarr closed 3 months ago
Hello,
What does record_list.txt look like? I am wondering why you are not passing the list directly, like this:
slow5tools get all.blow5 --list record_list.txt -o all_selected_reads.blow5
I am asking because that will be much faster and more efficient than using a bash loop. The bash loop spawns a slow5tools process for every single read ID, which means the index has to be loaded every single time. I would not be surprised if the 197768 reads took days instead of minutes.
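A minimal sketch of the single-invocation approach, assuming record_list.txt holds one read ID per line. The `clean_read_ids` helper and the `record_list.clean.txt` file name are illustrative and not part of slow5tools. The helper only strips Windows line endings and surrounding whitespace, and drops blank lines and duplicate IDs, before the list is handed to `slow5tools get`.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of slow5tools): normalise a read ID list.
# Removes carriage returns, trims surrounding whitespace, skips blank lines,
# and keeps only the first occurrence of each ID.
clean_read_ids() {
    tr -d '\r' < "$1" | awk 'NF { $1 = $1; if (!seen[$0]++) print }'
}

# Only run the extraction when the inputs and the tool are actually present.
if [ -f record_list.txt ] && [ -f all.blow5 ] && command -v slow5tools >/dev/null 2>&1; then
    clean_read_ids record_list.txt > record_list.clean.txt
    # One process and one index load for the whole list.
    slow5tools get all.blow5 --list record_list.clean.txt -o all_selected_reads.blow5
fi
```

With this approach there is no directory of per-read files and no separate merge step, since all selected records are written to a single output file.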
If you still want to stick to the bash loop method, can you replace -or "$p".blow5
with -o "$p".blow5
in your bash loop and see if the error persists?
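An empty file cannot contain the end-of-file marker named in the error, so one quick check before merging is to look for zero-byte outputs. The sketch below assumes the per-read files sit in separated_records/ and end in .blow5. The `find_empty_blow5` helper is illustrative and not part of slow5tools, and it will not detect files that are truncated but non-empty.

```shell
#!/usr/bin/env bash

# Hypothetical helper (not part of slow5tools): print every zero-byte
# .blow5 file found under the given directory.
find_empty_blow5() {
    find "$1" -type f -name '*.blow5' -size 0
}

# Report empty per-read files, if the directory exists.
if [ -d separated_records ]; then
    find_empty_blow5 separated_records
fi
```

Any path printed here points to a read ID whose extraction produced no data and is worth rerunning on its own.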
Has this issue been addressed?
Closing this issue for now. If you are still having trouble, feel free to reopen.
Hello,
For an analysis that I want to perform, I am trying to generate a blow5 file containing only the records that I need. To do this, I used the get command to create several separate files:
while read p; do
  slow5tools get all.blow5 "$p" -or "$p".blow5
done < record_list.txt
and then I wanted to merge them all together, using the merge command:
slow5tools merge separated_records/ -o selected_data.blow5
However, when I run this last command I get an error:
[list_all_items] Looking for '*.slow5' files in separated_records/
[merge_main] 197768 files found - took 0.206s
[merge_main] Allocating new read group numbers - took 4.601s
[slow5_get_next_mem::ERROR] Malformed blow5 record. Failed to read the record size. Missing blow5 end of file marker. At src/slow5.c:3236
What could be causing this last command to fail?
Thank you very much