KlausVigo / phangorn

Phylogenetic analysis in R
http://klausvigo.github.io/phangorn/

Pratchet duplicates tips #120

kbhoehn closed this issue 3 years ago

kbhoehn commented 3 years ago

Hello - thanks for fixing the previous issues with phangorn. I've installed the latest version, but strangely pratchet sometimes returns trees with more tips than were provided as input sequences. It duplicates some tip labels. I've attached an example file which contains 31 sequences but produces a tree with 42 tips.

Example data: problem_seqs.zip

Example code:

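data = readRDS("problem_seqs.rds")
tree = phangorn::pratchet(data, trace=FALSE)

# Number of input sequences vs. number of tips in the returned tree
paste(length(data), length(tree$tip.label)) # 31 42
# Every input sequence name occurs exactly once
max(table(names(data))) # 1
# Some tip labels occur twice in the tree
max(table(tree$tip.label)) # 2

KlausVigo commented 3 years ago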

Hello @kbhoehn, thanks again for reporting. Should be fixed with commit 6040f7d. Kind regards, Klaus

kbhoehn commented 3 years ago

Great - thanks. I tried this new commit and unfortunately it now seems to remove sequences from the tree. I've attached example sequences. You can run them using:

Example file: problem_seqs_removed.zip

data = readRDS("problem_seqs_removed.rds")
tree = phangorn::pratchet(data, trace=FALSE)

# Number of input sequences vs. number of tips in the returned tree
paste(length(data), length(tree$tip.label)) # 11 sequences, 10 tips

KlausVigo commented 3 years ago

Thanks again @kbhoehn for the bug report and the nice reproducible example. Should be fixed now. Fingers crossed. Have a nice weekend, Klaus

kbhoehn commented 3 years ago

That seemed to work - thanks!
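
A minimal sketch of a sanity check for the two symptoms reported in this thread, duplicated tip labels and dropped sequences. `check_tips` is a hypothetical helper, not part of phangorn. It only compares two character vectors, so the toy labels below stand in for `names(data)` and `tree$tip.label`.

```r
# Hypothetical helper (not part of phangorn): compare the input sequence
# names with the tip labels of a returned tree.
check_tips <- function(seq_names, tip_labels) {
  list(
    # labels that occur more than once among the tips
    duplicated = unique(tip_labels[duplicated(tip_labels)]),
    # input sequences that have no tip in the tree
    missing    = setdiff(seq_names, tip_labels),
    # tips that do not correspond to any input sequence
    unexpected = setdiff(tip_labels, seq_names)
  )
}

# Toy labels (assumed for illustration), standing in for
# names(data) and tree$tip.label
res <- check_tips(c("a", "b", "c"), c("a", "b", "b"))
print(res)
```

With real data the call would presumably be `check_tips(names(data), tree$tip.label)`; all three components should be empty for a well-formed result.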

kbhoehn commented 3 years ago

Hi Klaus. Thanks again for fixing these issues. I am developing my own package that depends on the new changes to phangorn, and was hoping to submit it to CRAN in the next couple of weeks. It would be really helpful if the updated version of phangorn with the bug fixes were available by then. Would it be possible to update the phangorn package on CRAN in that timeframe? Best, -Ken

KlausVigo commented 3 years ago

Hi Ken, my plan is to release it in about two weeks' time. What's your package about? It's end-of-semester stress at the moment. Cheers, Klaus

kbhoehn commented 3 years ago

Great - that timeline would work for me. My package is for doing phylogenetic analysis of B cell receptor sequences. This often means building hundreds of trees, which is probably why I keep finding so many strange examples. Here's the current docs page if you're interested: https://dowser.readthedocs.io.

kbhoehn commented 3 years ago

Hi Klaus - have you been able to submit the changes to CRAN? I need to submit my package early this week. Best, -Ken

KlausVigo commented 3 years ago

Hi Ken, I'll try to submit today. It always takes longer than expected, and the CRAN checks always take more time too. Regards, Klaus

KlausVigo commented 3 years ago

Hi Ken, it was fast today. phangorn is on the way to CRAN.

kbhoehn commented 3 years ago

Hi Klaus, Fantastic - thank you! Best, -Ken