dlce-eva / python-nexus

python-nexus - Generic nexus (.nex, .trees) reader/writer for python
BSD 2-Clause "Simplified" License
11 stars 3 forks source link

refactor combine_nexuses #42

Closed SimonGreenhill closed 1 year ago

SimonGreenhill commented 1 year ago

The utility function combine_nexuses could be made less opaque:

https://github.com/dlce-eva/python-nexus/blob/ebaa5ee2e50a00aad305c91065ecffdc34f882ff/src/nexus/tools/combine_nexuses.py#L40-L59

  1. it relabels characters by filename (specifically the short_filename attribute which is undocumented and hacky). Better to add explicit names e.g. combine_nexuses(label1=nexus_obj1, label2=nexus_obj2)

  2. It returns a NexusWriter instance rather than a NexusReader instance which is confusing. I think it only does this as combine_datablock instances a new writer and writes all data to that. It's probably quicker and less clunky to directly manipulate a NexusReader instance anyway.

xrotwang commented 1 year ago

Ah, yes. This schism between NexusReader and NexusWriter has confused me for some time, too. I can have a look at this later today.

xrotwang commented 1 year ago

@SimonGreenhill Should we consider combine_nexuses returning a NexusReader a breaking change - thus triggering v3, or rather a bug-fix?

xrotwang commented 1 year ago

@SimonGreenhill ideally, I think, v3 wouldn't have the reader/writer distinction at all anymore.

SimonGreenhill commented 1 year ago

The reader/writer distinction is because I went for the simplest way of implementing a reader, followed by a writer, and it only later became apparent that they should share lots of functionality and should be identical (or subclassed).

It leads to lots of weird little warts in lots of places that I'd love to fix but it's a major rewrite to fix.

I agree v3 should be combining R/W. How about we leave part 2 for now (until v3), but implement 1?

xrotwang commented 1 year ago

@SimonGreenhill I recently re-factored newick such that the result of parsing is a list of tokens, with enough info to get all the super-structure later quickly. I'm currently experimenting with the same approach for Nexus. At least for a reader that seems to be the way to go. But even extending this with writer functionality look possible.

SimonGreenhill commented 1 year ago

closing this as deprecated in favor of commonnexus