Hello,

Continuing the update PR series, we now move to the make_prg/utils module. This module was created in this PR; before it, we had seq_utils.py, io_utils.py and prg_encoder.py directly inside the make_prg dir. With some refactoring and a good amount of new code, the number of utils files increased from 3 to 7, so I decided to create a utils module. The annoying part of moving and refactoring at the same time is that git sometimes does not cope well with these simultaneous changes. For some files, e.g. seq_utils.py, it reports that make_prg/seq_utils.py was completely removed and that make_prg/utils/seq_utils.py is a completely new file, when it was actually just a file rename plus several changes. This makes reviewing a bit more laborious, but if you prefer we can improve on this (i.e. commit the file renaming operations to this branch, and then PR just the code changes).

Some points to help with the review:
Class GFA_Output was extracted from make_prg/io_utils.py and refactored into its own source file, make_prg/utils/gfa.py, with very few changes;
Many functions in make_prg/utils/io_utils.py are marked with a note that they are not unit tested, as unit testing them would be a bit complicated. However, they are covered by integration tests. I think the only function that remained in this source file from the previous code is load_alignment_file(); all the others are new;
The code in make_prg/utils/gfa.py and make_prg/utils/recursive_tree_drawer.py is tested through a single big bang test (i.e. a single test that checks whether they produce the expected output). This is not ideal, and it is more of a regression test checking that we keep producing the same output over the next versions. However, I think it is fine, as these are not crucial outputs produced by make_prg: we don't often look at the .gfa files, and when we do it is mostly for debugging. The recursive tree drawer is exclusively for debugging, not for user consumption: it helps to understand the recursive tree generated by make_prg, which is rather complicated to visualise in debug mode;
prg_encoder.py has basically no changes.

Thanks a lot for the help!
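
P.S. To make the "single big bang test" idea concrete, here is a minimal sketch of that style of regression test. Everything in it is a hypothetical stand-in for illustration: write_report, TestReportRegression and the tab-separated output are not make_prg code and not the actual tests in this PR. The pattern is simply to run the writer once and compare the whole output against a stored expected result.

```python
import tempfile
import unittest
from pathlib import Path


def write_report(nodes, out_path):
    """Hypothetical output writer (a stand-in, not make_prg code).

    Writes one line per node: the node name and its number of children.
    """
    lines = [f"{name}\t{len(children)}" for name, children in sorted(nodes.items())]
    Path(out_path).write_text("\n".join(lines) + "\n")


class TestReportRegression(unittest.TestCase):
    def test_output_matches_stored_truth(self):
        # Small fixed input, so the output is deterministic.
        nodes = {"root": ["a", "b"], "a": [], "b": ["c"], "c": []}

        # Assumption: in a real test suite this would be read from a
        # truth file committed next to the tests.
        expected = "a\t0\nb\t1\nc\t0\nroot\t2\n"

        with tempfile.TemporaryDirectory() as tmp:
            out_path = Path(tmp) / "report.tsv"
            write_report(nodes, out_path)
            # One coarse assertion on the whole file: any change in the
            # output format fails the test, without saying which part broke.
            self.assertEqual(out_path.read_text(), expected)
```

Saved as a test module, this can be run with python -m unittest. The trade-off is the one described above: such a test is cheap to write and catches unintended output changes, but it does not pinpoint which part of the code is responsible.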