morrislab / pairtree

Pairtree is a method for reconstructing cancer evolutionary history in individual patients, and analyzing intratumor genetic heterogeneity. Pairtree focuses on scaling to many more cancer samples and cancer cell subpopulations than other algorithms, and on producing concise and informative interactive characterizations of posterior uncertainty.
MIT License

Pairtree generating nonexistent variant IDs, leading to KeyError #52

Closed. cathy-y closed this issue 2 months ago

cathy-y commented 2 months ago

Hello, thanks for creating this tool! I installed Pairtree successfully using the instructions in the README, but I'm getting this error when I run it:

Traceback (most recent call last):
  File "/home/cayan/pairtree/bin/pairtree", line 177, in <module>
    main()
  File "/home/cayan/pairtree/bin/pairtree", line 105, in main
    supervars = clustermaker.make_cluster_supervars(clusters, variants)
  File "/home/cayan/pairtree/bin/../lib/clustermaker.py", line 113, in make_cluster_supervars
    cvars = [variants[vid] for vid in cluster]
  File "/home/cayan/pairtree/bin/../lib/clustermaker.py", line 113, in <listcomp>
    cvars = [variants[vid] for vid in cluster]
KeyError: 's8341'

It seems that it is trying to find a mutation with the ID "s8341". However, my input has only 8341 SNVs, so the IDs run contiguously from "s0" to "s8340". My ssm and parameters files are numbered in exactly the same way, and neither contains "s8341" as an ID; the highest-numbered ID is "s8340". There are no duplicate IDs, and no ID appears in more than one cluster.
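For anyone who wants to verify the same properties on their own inputs, here is a minimal sketch of those checks. It assumes every ID has the form "s" followed by an integer, and `check_ids` is a hypothetical helper written for illustration, not part of Pairtree. It takes the IDs from the ssm file and the list of clusters from the parameters file as plain Python lists.

```python
from collections import Counter


def check_ids(ssm_ids, clusters):
    """Summarize duplicate, overlapping, and non-contiguous variant IDs.

    Hypothetical helper, not part of Pairtree. Assumes each ID looks
    like 's<integer>'; any other ID form will raise a ValueError.
    """
    ssm_counts = Counter(ssm_ids)
    cluster_counts = Counter(vid for cluster in clusters for vid in cluster)

    # Numeric parts of the distinct ssm IDs, e.g. 's12' -> 12.
    numbers = sorted(int(vid[1:]) for vid in ssm_counts)

    return {
        # IDs listed more than once in the ssm file.
        'duplicate_ssm_ids': sorted(v for v, n in ssm_counts.items() if n > 1),
        # IDs that occur more than once across all clusters.
        'ids_in_multiple_clusters': sorted(
            v for v, n in cluster_counts.items() if n > 1
        ),
        # True when the distinct IDs are exactly s0, s1, ..., s(N-1).
        'contiguous_from_zero': numbers == list(range(len(numbers))),
        'highest_id': 's%d' % numbers[-1] if numbers else None,
    }


if __name__ == '__main__':
    print(check_ids(['s0', 's1', 's2'], [['s0', 's1'], ['s2']]))
```

With 8341 distinct, contiguous IDs, `highest_id` would be "s8340" and `contiguous_from_zero` would be true.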

I'm not sure why Pairtree is generating IDs that are not in any of my inputs.

Thanks in advance!

cathy-y commented 2 months ago

Some additional information, if it helps:

ethanumn commented 2 months ago

This error is raised precisely when a cluster in your parameters file contains an ID that is not present in the ssm file. It seems you've already checked that this isn't the case. If you Ctrl+F the ID s8341 in the parameters file, do you find nothing? It's also possible that a formatting issue is causing this.
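A more systematic alternative to Ctrl+F is to compare the two files programmatically. The sketch below assumes the ssm file is tab-separated with a header row containing an `id` column, and that the parameters file is JSON with a `clusters` key holding lists of variant IDs. `find_missing_ids` is a hypothetical helper, not a Pairtree function.

```python
import csv
import json


def find_missing_ids(ssm_path, params_path):
    """Return (cluster_index, variant_id) pairs absent from the ssm file.

    Hypothetical helper, not part of Pairtree. Assumes a tab-separated
    ssm file with an 'id' header column and a JSON parameters file with
    a 'clusters' list of lists.
    """
    with open(ssm_path, newline='') as f:
        ssm_ids = {row['id'] for row in csv.DictReader(f, delimiter='\t')}

    with open(params_path) as f:
        params = json.load(f)

    missing = []
    for idx, cluster in enumerate(params.get('clusters', [])):
        for vid in cluster:
            if vid not in ssm_ids:
                missing.append((idx, vid))
    return missing
```

Printing each returned ID with `repr()` can also expose stray whitespace or other invisible characters that a plain text search would miss.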

cathy-y commented 2 months ago

When I Ctrl+F the offending ID in the parameter file, I find nothing. I've formatted the inputs as described, and am happy to send you the files via email so you can test it out on your end.

ethanumn commented 2 months ago

Interesting. If you send the ssm and parameters files I should be able to resolve this quickly.