tanghaibao / goatools

Python library to handle Gene Ontology (GO) terms
BSD 2-Clause "Simplified" License
749 stars 212 forks source link

Fixed bug in write_hierarchy_base.py #219

Closed avigailtaylor closed 3 years ago

avigailtaylor commented 3 years ago

In the previous version of the code, if a term had already been printed, then the hierarchy below it would not be printed again, however the term itself would be. While this stopped whole hierarchies being duplicated, it did not prevent some terms being printed more than once. In this version of the code, the conditional testing for no repetition (if no_repeat:) is moved to the top of function prt_hier_rec, which fixes this bug; this also has the bonus effect of preventing code being run that does not need to run.

dvklopfenstein commented 3 years ago

This sounds really good. Can you add a test?

tanghaibao commented 3 years ago

@avigailtaylor

Thank you for your contribution. There is just one test to fix in tests/test_write_hier.py:

--- a/tests/test_write_hier.py
+++ b/tests/test_write_hier.py
@@ -43,8 +43,7 @@ def write_hier_norep(gosubdag, out):
     assert set(gos_printed) == set(objwr.gosubdag.go2nt)
     assert gos_printed == ['GO:0000001', 'GO:0000002', 'GO:0000005', 'GO:0000010',
                            'GO:0000003', 'GO:0000004', 'GO:0000007', 'GO:0000009',
-                           'GO:0000005', 'GO:0000006', 'GO:0000008', 'GO:0000009',
-                           'GO:0000010']
+                           'GO:0000006', 'GO:0000008']

 def write_hier_lim(gosubdag, out):

Please fix as suggested in your branch - once the test passes we'll approve your change.

avigailtaylor commented 3 years ago

Offline, we have discussed why this pull request needed to be rejected. I'm following it up with another pull request that addresses the issues raised here.