callumparr closed 4 years ago
Hi @callumparr -- are you looking for the compress_fast5 script?
https://github.com/nanoporetech/ont_fast5_api#compress_fast5
I am looking for something like this, but going from VBZ to GZIP, so that I can run it on the fast5 output files containing squiggles generated with the Master of Pores pipeline and make them compatible with tailfindr and nanopolish polya:
find . -name "*.fast5" | xargs -P 10 -I % h5repack -f UD=32020,5,0,0,2,1,1 % %.vbz
Or even overwriting the input files to simplify things:
find . -name "*.fast5" | xargs -P 10 -I % sh -c "h5repack -f UD=32020,5,0,0,2,1,1 % %.vbz && mv %.vbz %"
Can I simply replace the -f filter value with GZIP=1 and the .vbz suffix with .gzip?
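A minimal sketch of what that substitution might look like for a single file. h5repack does document `-f GZIP=<level>` as a filter option. The helper names `gzip_target` and `repack_to_gzip` and the `DRY_RUN` switch are hypothetical, introduced here only for illustration. It assumes the VBZ HDF5 plugin is installed and discoverable (for example via `HDF5_PLUGIN_PATH`), since h5repack has to decompress the VBZ data before it can rewrite it.

```shell
#!/bin/sh
# Sketch only: repack one fast5 file from VBZ to gzip with h5repack.
# Assumption: the VBZ HDF5 plugin is discoverable (e.g. via HDF5_PLUGIN_PATH),
# otherwise h5repack cannot read the VBZ-compressed input.

# Hypothetical helper: a.fast5 -> a.fast5.gzip (mirrors the %.vbz naming above).
gzip_target() {
    printf '%s.gzip\n' "$1"
}

# Hypothetical helper: repack "$1"; with DRY_RUN=1 only print the command.
repack_to_gzip() {
    in=$1
    out=$(gzip_target "$in")
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf 'h5repack -f GZIP=1 %s %s\n' "$in" "$out"
    else
        h5repack -f GZIP=1 "$in" "$out"
    fi
}
```

Running it with `DRY_RUN=1` first shows the exact command before anything is written. Inspecting one converted file with `h5dump -pH` should confirm which filter ended up on the raw signal dataset.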
compress_fast5 goes both ways -- you can convert the raw dataset from gzip to vbz, and also from vbz to gzip. Is that good enough?
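For the vbz-to-gzip direction, a call might look roughly like the sketch below. The flags (`-i`, `-s`, `-c`, `-t`, `--recursive`) are taken from the ont_fast5_api README as I understand it, so check `compress_fast5 --help` for the installed version. The wrapper name `vbz_to_gzip`, the `DRY_RUN` switch, and the directory names are hypothetical.

```shell
#!/bin/sh
# Sketch only: convert a directory of VBZ-compressed fast5 files to gzip.
# Assumption: compress_fast5 accepts -i (input dir), -s (save dir),
# -c (target compression), -t (threads) and --recursive, as in the README.

# Hypothetical wrapper; with DRY_RUN=1 it only prints the command it would run.
vbz_to_gzip() {
    in_dir=$1
    out_dir=$2
    threads=${3:-4}
    set -- compress_fast5 -i "$in_dir" -s "$out_dir" -c gzip -t "$threads" --recursive
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}
```

For example, `vbz_to_gzip vbz_reads gzip_reads 8` would write gzip-compressed copies into `gzip_reads` and leave the originals untouched.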
Yeah, sorry, I think this has more to do with my basic understanding of informatics.
Great, so you have everything you need then?
Sorry, I forgot to close this.
I have seen the VBZ compression repository containing the ONT HDF5 plugin for working with VBZ compression, but is there any plan to make a subtool within ont_fast5_api?
I am a little apprehensive about changing the -f parameter of h5repack without knowing exactly what it does. It would also be good to have some system for naming output files so that they match the inputs when working from a list of fast5 files.
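On the -f parameter: `UD=` is h5repack's syntax for a user-defined filter, and 32020 is the registered HDF5 filter ID of the VBZ plugin, whereas `GZIP=1` selects the built-in gzip (deflate) filter at level 1. For the naming question, one possible approach is to mirror each input path under a separate output directory, so every output keeps its input's name. The sketch below assumes that layout. The helpers `mirror_path` and `repack_list` are hypothetical, not part of ont_fast5_api or HDF5.

```shell
#!/bin/sh
# Sketch only: keep output names identical to input names by mirroring the
# input tree under a separate output root.

# Hypothetical helper: map IN_ROOT/sub/x.fast5 to OUT_ROOT/sub/x.fast5.
mirror_path() {
    in_root=${1%/}     # drop one trailing slash, if present
    out_root=${2%/}
    file=$3
    rel=${file#"$in_root"/}   # path relative to the input root
    printf '%s/%s\n' "$out_root" "$rel"
}

# Hypothetical helper: read fast5 paths (one per line) on stdin and repack
# each one to gzip at the mirrored location.
# Assumption: the VBZ HDF5 plugin is discoverable by h5repack.
repack_list() {
    in_root=$1
    out_root=$2
    while IFS= read -r f; do
        out=$(mirror_path "$in_root" "$out_root" "$f")
        mkdir -p "$(dirname "$out")"
        h5repack -f GZIP=1 "$f" "$out"
    done
}
```

A possible invocation is `find vbz_reads -name "*.fast5" | repack_list vbz_reads gzip_reads`, which avoids overwriting the inputs and needs no renaming step afterwards.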