Open vortexing opened 3 years ago
Oh. My. Gosh. Just FYI, your code breaks if there is a duplicate mutation_id in a sample's dataset. It doesn't FIX the duplicate, just breaks. SUPER minor, but hey, just FYI for ease of use, perhaps a quick filter for uniqueness OR a mention in the docs. ;) I KNEW it felt like something stupid... and it was...
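For anyone hitting the same thing, here is a minimal sketch of the kind of uniqueness pre-check mentioned above, run on the input TSV before handing it to pyclone-vi. It is not part of pyclone-vi itself: the helper names `find_duplicate_mutations` and `drop_duplicate_mutations` are made up for illustration, and it assumes the input is tab-separated with `mutation_id` and `sample_id` columns.

```python
import csv
import io
from collections import Counter


def find_duplicate_mutations(tsv_text):
    """Return {(sample_id, mutation_id): count} for pairs that appear more than once.

    Hypothetical helper; assumes tab-separated text with a header row that
    includes 'sample_id' and 'mutation_id' columns.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    counts = Counter((row["sample_id"], row["mutation_id"]) for row in reader)
    return {key: n for key, n in counts.items() if n > 1}


def drop_duplicate_mutations(tsv_text):
    """Return the rows as dicts, keeping only the first row per (sample_id, mutation_id)."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    seen = set()
    kept = []
    for row in reader:
        key = (row["sample_id"], row["mutation_id"])
        if key not in seen:
            seen.add(key)
            kept.append(row)
    return kept


if __name__ == "__main__":
    # Made-up example data: chr1:100 is listed twice for sample S1.
    example = (
        "mutation_id\tsample_id\tref_counts\talt_counts\n"
        "chr1:100\tS1\t50\t10\n"
        "chr1:200\tS1\t40\t20\n"
        "chr1:100\tS1\t48\t12\n"
    )
    print(find_duplicate_mutations(example))
    print(len(drop_duplicate_mutations(example)))
```

Keeping the first occurrence is an arbitrary choice here. If the duplicate rows carry different counts, it is probably worth looking at why they were duplicated upstream rather than silently dropping one.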
We've been trying out pyclone-vi on our data and we're seeing this weird behavior where it works just fine when we put in 10-20 variants per sample, but once we put in the full list of 300-400 mutations, it balks. We're continuing to troubleshoot to see if it's somehow our HPC or software install environment, but on the off chance this looks familiar to you I thought I'd post the error.
The input is data from 1 sample at a time, in the right format, but there is no tumour content column or error rate column in our datasets. When the script is run, stdout only has:
Tumour content column not found. Setting values to 1.0.
, so we know things are getting to the right place and getting read in to that point, but then we're seeing this (again, only when we do not truncate our input data set to a small number of variants):

Any gems? Could we have some sort of file parsing issue for a particular variant name (are there certain characters we can't use in a variant ID)? I feel like this is something silly but can't put my finger on it.