Now the polbin binary can both read and write two formats: text GFA files and "FlatGFA" binary files. So these four commands are possible:
$ polbin < something.gfa # round trip through in-memory FlatGFA, print GFA to stdout
$ polbin -o cool.flatgfa < something.gfa # convert GFA to FlatGFA
$ polbin -i cool.flatgfa # print a FlatGFA file out as plain ol GFA, to stdout
$ polbin -i cool.flatgfa -o ice_cold.flatgfa # glorified `cp`, no reason to do this
I also added test environments to check both kinds of round-tripping (through in-memory FlatGFA and through an on-disk file). It all works!!!!!
$ turnt -j -e polbin_mem -e polbin_file *.gfa
1..16
ok 1 - DRB1-3123.gfa polbin_mem
ok 2 - DRB1-3123.gfa polbin_file
ok 3 - LPA.gfa polbin_mem
ok 4 - LPA.gfa polbin_file
ok 5 - chr6.C4.gfa polbin_mem
ok 6 - chr6.C4.gfa polbin_file
ok 7 - k.gfa polbin_mem
ok 8 - k.gfa polbin_file
ok 9 - note5.gfa polbin_mem
ok 10 - note5.gfa polbin_file
ok 11 - overlap.gfa polbin_mem
ok 12 - overlap.gfa polbin_file
ok 13 - q.chop.gfa polbin_mem
ok 14 - q.chop.gfa polbin_file
ok 15 - t.gfa polbin_mem
ok 16 - t.gfa polbin_file
Conversion seems to be decently fast on these small examples. For our go-to big example, chr8.pan.gfa (4.2 GB), one run of conversion on my rapidly aging Intel iMac took 1m8s for parsing (GFA -> FlatGFA) and 1m44s for pretty-printing (FlatGFA -> GFA). Seems within the ballpark of reasonableness? (Moreover, the GFA seems to have round-tripped successfully. FWIW, just running diff to check took 22s.)
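For anyone who wants to repeat the round-trip check by hand on a single file (like the chr8 one, which isn't in the test suite), here is a rough sketch. It only uses the polbin invocations listed above plus diff; the function names and the temp-file handling are made up for illustration and are not part of this PR.

```shell
# Hypothetical helpers (not part of polbin): each returns 0 if the GFA
# file survives the round trip unchanged, according to diff.

# In-memory round trip: GFA -> FlatGFA -> GFA within one polbin process.
roundtrip_mem() {
    polbin < "$1" | diff - "$1"
}

# On-disk round trip: write a FlatGFA file, then read it back as GFA.
roundtrip_file() {
    gfa="$1"
    tmp="$(mktemp)" || return 1
    # Convert to FlatGFA on disk, then print it back out and compare to the input.
    polbin -o "$tmp" < "$gfa" && polbin -i "$tmp" | diff - "$gfa"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

Usage would be something like `roundtrip_file chr8.pan.gfa && echo ok`. Prefixing either call with `time` gives a rough end-to-end number, though that includes the diff and so will not match the separate parsing and pretty-printing timings above.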