KamilSJaron / smudgeplot

Inference of ploidy and heterozygosity structure using whole genome sequencing data
Apache License 2.0

ploidyplot never finishes #141

Open AlcaArctica opened 6 months ago

AlcaArctica commented 6 months ago

I have a weird issue with PloidyPlot. It used to work fine:

PloidyPlot -e12 -k -v -T38 -o/smudgeplot/kmerpairs /smudgeplot/FastK_Table

Activating conda environment: .snakemake/conda/5e1128158bcbbb147df4df6f3dcfa370_

  The input table is untrimmed and not symmetric

  Trimming k-mers in table with count < 12

  Making trimmed table symmetric

  Starting to count covariant pairs

  Count complete, plotting

  About to save stuff

  Saving stuff

But for a while now I can't get past "Trimming k-mers in table with count < 12", as in this example, even though nothing in my command has changed:

PloidyPlot -e12 -k -v -T38 -o/smudgeplot/kmerpairs /smudgeplot/FastK_Table

Activating conda environment: .snakemake/conda/5e1128158bcbbb147df4df6f3dcfa370_

  The input table is untrimmed and not symmetric

  Trimming k-mers in table with count < 12

I can let the command run for many hours, but it produces no output, and eventually the Slurm scheduler cancels it for exceeding the time limit (previously a run rarely took more than one hour). I really do not understand why it worked before but no longer does. Have you ever observed anything similar? Is there a smart way to debug this problem? Thank you very much for any hints.

KamilSJaron commented 6 months ago

Uh, that is new. So far all runs have completed just fine, even on my laptop. Are you sure the input data looks OK? When you build the FastK k-mer database, can you retrieve a reasonable-looking histogram? Can you post it, and perhaps add which L you chose?

How big is the dataset? I wonder how impractical it would be to share it for debugging purposes.
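
For the histogram check mentioned above, something like the following could give a quick summary before posting. It is only a minimal sketch, and it assumes the histogram has already been exported as plain text with two whitespace-separated columns (coverage, number of distinct k-mers). That format, the function name `summarize_histogram`, and the toy numbers are assumptions for illustration, not something FastK or smudgeplot provides. The threshold of 12 mirrors the `-e12` used in the command above.

```python
import io


def summarize_histogram(lines, threshold=12):
    """Summarize a two-column k-mer histogram (coverage, distinct k-mer count).

    Hypothetical helper: assumes a plain-text export, one bin per line.
    """
    total = 0          # distinct k-mers across all coverage bins
    kept = 0           # distinct k-mers with coverage >= threshold
    peak_cov = None    # coverage of the tallest bin at or above the threshold
    peak_count = -1
    for line in lines:
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        try:
            cov, count = int(fields[0]), int(fields[1])
        except ValueError:
            continue  # skip header or comment lines
        total += count
        if cov >= threshold:
            kept += count
            if count > peak_count:
                peak_cov, peak_count = cov, count
    return {"total": total, "kept": kept, "peak_coverage": peak_cov}


# Toy data only, not taken from a real run.
toy = io.StringIO("1 1000\n2 300\n11 5\n12 8\n30 90\n31 100\n32 95\n")
summary = summarize_histogram(toy, threshold=12)
print(summary)
```

If `kept` is close to zero, or the peak sits right at the threshold, the chosen L is probably worth revisiting. For a real file, pass an open file handle instead of the `io.StringIO` toy input.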

KamilSJaron commented 4 months ago

Is there a chance of getting a bit more info here? I would like to debug this problem if I could reproduce it.