jeetsukumaran / DendroPy

A Python library for phylogenetic scripting, simulation, data processing and manipulation.
https://pypi.org/project/DendroPy/
BSD 3-Clause "New" or "Revised" License

Multithreaded option fails with no error message - leaf labels issue? #180

Closed AthinaGav closed 1 month ago

AthinaGav commented 8 months ago

I was using sumtrees.py with the multithreaded option (-m), but only an empty .sumtrees file was produced, and the analysis appeared to run forever without using any resources. The only error message, which appeared among the normal process-startup messages and did not stop the analysis, was this:

TypeError: cannot pickle '_io.TextIOWrapper' object
Traceback (most recent call last):
  File ".../anaconda3/envs/dendropy/lib/python3.11/multiprocessing/queues.py", line 244, in _feed
    obj = ForkingPickler.dumps(obj)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^
  File ".../anaconda3/envs/dendropy/lib/python3.11/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
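For anyone else reading this traceback: it comes from the standard library's multiprocessing queue, which pickles every object before sending it to another process. Open text streams cannot be pickled. The sketch below reproduces only that generic Python behavior. It does not use DendroPy, and `try_pickle` is a hypothetical helper name. Which object sumtrees.py actually tried to send is not known from this report. Because the failure happens in the queue's background feeder thread, the message is printed and the item is dropped, which is probably why the run hangs instead of stopping.

```python
import io
import pickle


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error message.

    Hypothetical helper for illustration; not part of DendroPy.
    """
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# An in-memory text stream stands in for an open file or log handle.
stream = io.TextIOWrapper(io.BytesIO())

message = try_pickle(stream)
print(message)

# Plain data such as a label string pickles without trouble.
print(try_pickle("StRir_1"))
```

If an exception or result object carries a reference to an open file, pickling it fails in the same way, regardless of what the original problem was.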

After a lot of trial and error (for a while I thought there was some Python version conflict, which was odd because everything had worked fine a while ago), I realized what the problem was. The labels in my input trees included two that the tool saw as duplicates: StRir_1 and StRiR_1. The only difference between them is the case of one letter. I did not expect that, and it was really difficult to figure out because the error message seemed unrelated. After renaming one of the labels, everything ran smoothly. I am only reporting this here in case someone else comes across this issue. In addition, I would suggest a more informative error message, or perhaps an additional check in the multithreaded version.
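In case it helps others, here is a minimal sketch of the kind of pre-check I mean: it groups labels that differ only by letter case, so they can be renamed before running sumtrees.py. It is plain Python and does not call DendroPy. `find_case_collisions` is a name I made up, and extracting the labels from your tree files is left out.

```python
from collections import defaultdict


def find_case_collisions(labels):
    """Return groups of distinct labels that are equal when case is ignored.

    Hypothetical helper for illustration; not part of DendroPy.
    """
    groups = defaultdict(set)
    for label in labels:
        # casefold() is a more aggressive lower() intended for caseless matching.
        groups[label.casefold()].add(label)
    # Keep only groups with more than one distinct spelling.
    return sorted(sorted(group) for group in groups.values() if len(group) > 1)


# The two labels from this report plus one unrelated example label.
labels = ["StRir_1", "StRiR_1", "Homo_sapiens"]
collisions = find_case_collisions(labels)

for group in collisions:
    print("Labels differing only by case:", ", ".join(group))
```

An empty result means no two labels collide when case is ignored. Exact duplicates are collapsed by the set, so they are not reported by this check.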

Thanks again to the devs for this nice tool.

mmore500 commented 3 months ago

Hi @AthinaGav --- would it be possible to send an example script/data where this occurs so I can try to reproduce it on our end?

mmore500 commented 1 month ago

Closing this for now, feel free to reopen if this is still relevant. 👍