iqbal-lab-org / make_prg

Code to create a PRG from a Multiple Sequence Alignment file

make_prg update PR series: 1. make_prg/from_msa refactoring/updates #33

Closed (leoisl closed 2 years ago)

leoisl commented 2 years ago

Starting the series of PRs to the dev branch. We start with the make_prg/from_msa source code directory.

@bricoletc unfortunately there were two units that I could not easily add to the code in this PR, so I reverted to what was there before. This is a shame, as development time was spent writing and testing these units, but adding them would make the code harder to read, understand and maintain, so I decided not to add them. These were the class ClusteringResult and the function merge_sequences. However, they only concern how we represent the results of the clustering. The core logic of the latest PR, which was to not cluster when there are too few unique sequences or alignment ambiguities, was incorporated into the code and will be described in the next PRs.
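To illustrate the "do not cluster" rule mentioned above, here is a minimal sketch. It is not the code from this PR: the function name `should_cluster`, the threshold `MIN_UNIQUE_SEQS` and the unambiguous alphabet are all hypothetical, chosen only to show the shape of the check.

```python
from typing import List

# Hypothetical values for illustration; neither is taken from make_prg.
MIN_UNIQUE_SEQS = 3
UNAMBIGUOUS = set("ACGT-")


def should_cluster(sequences: List[str], min_unique: int = MIN_UNIQUE_SEQS) -> bool:
    """Return False when clustering should be skipped for this set of sequences."""
    # Collapse duplicates, ignoring case.
    unique = {seq.upper() for seq in sequences}
    # Too few distinct sequences: clustering would not be informative.
    if len(unique) < min_unique:
        return False
    # Any character outside the unambiguous alphabet counts as an ambiguity.
    if any(set(seq) - UNAMBIGUOUS for seq in unique):
        return False
    return True
```

A caller would check `should_cluster(seqs)` first and fall back to treating the sequences as-is when it returns False.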

make_prg/from_msa/prg_builder.py was moved out of make_prg/from_msa, as it is now used by both the from_msa and update commands, and it was also heavily refactored because the recursion tree now needs to be represented explicitly. It will be included in upcoming PRs, but the changes are such that the old and new files are not comparable.

tests/from_msa/__init__.py was refactored and moved with a number of other helper functions to tests/test_helpers.py.

The next PR will come after this one is approved/merged.

mbhall88 commented 2 years ago

As mentioned in a previous thread, the coding_convention.md file added in https://github.com/iqbal-lab-org/make_prg/pull/33/commits/9ce064b88288625139b11060bda232a6b9e75955 should be renamed to CONTRIBUTING.md as this is the conventional file used to describe any dev-specific information about the codebase.

bricoletc commented 2 years ago

One last comment + renaming the contribution file to CONTRIBUTING.md and we're good here

great work leandro thank you :)