TGAC / KAT

The K-mer Analysis Toolkit (KAT) contains a number of tools that analyse and compare K-mer spectra.
http://www.earlham.ac.uk/kat-tools
GNU General Public License v3.0

KAT comp stalling when making plots #164

Open aaronphillips7493 opened 2 years ago

aaronphillips7493 commented 2 years ago

Hey,

I am trying to make some k-mer comparison plots using some short reads and a genome assembly I have done.

I ran the following command on my university's HPC with 8 threads and 20G of memory: kat comp -t 8 -m 17 -o SRs_vs_contigs reads.fastq contigs.fasta

And it seems to run fine, but it has been stalled at the following step for ~15 hours now: Creating plot(s) ...Matplotlib is building the font cache; this may take a moment.

I am wondering: how long is 'a moment'? Is it reasonable to be waiting this long for the plots? Could this be a memory issue?

Thanks for making a great tool, Aaron :)

P.S. I have attached the job log file here: slurm-12441689.txt

bjclavijo commented 2 years ago

Hi Aaron, that looks like a problem with your Python/matplotlib installation. The typical cause is matplotlib trying to use the wrong locale; you should be able to reproduce the problem by opening a Python interpreter and running "import matplotlib". You may just need to set the LC variables to sort it out.
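To check this outside of KAT, something like the following could time the import in a fresh interpreter, with and without locale overrides. This is only a rough sketch: the function name time_import is made up for illustration, and the "C" locale value is an assumption (the right LC settings depend on your cluster).

```python
import os
import subprocess
import sys
import time


def time_import(module, env_overrides=None, timeout=120):
    """Import `module` in a fresh interpreter and return (succeeded, seconds).

    `env_overrides` is a dict of environment variables (for example
    {"LC_ALL": "C"}) applied on top of the current environment. The import
    counts as failed if it errors or runs longer than `timeout` seconds.
    """
    env = dict(os.environ)
    env.update(env_overrides or {})
    start = time.monotonic()
    try:
        result = subprocess.run(
            [sys.executable, "-c", f"import {module}"],
            env=env,
            timeout=timeout,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        ok = result.returncode == 0
    except subprocess.TimeoutExpired:
        # The child is killed once the timeout expires, so a hung import
        # cannot stall for hours the way the KAT job did.
        ok = False
    return ok, time.monotonic() - start
```

Calling time_import("matplotlib") and then time_import("matplotlib", {"LC_ALL": "C", "LANG": "C"}) on the same node the job ran on should show whether the locale makes a difference. The timeout means a hang is reported as a failure rather than blocking the session.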

Best,

bj

jonwright99 commented 2 years ago

I've had this before and it's a matplotlib problem, not a problem with KAT. Unfortunately I can't remember how I fixed it, but I found some suggestions here: https://stackoverflow.com/questions/34771191/matplotlib-taking-time-when-being-imported
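One workaround that may be worth trying is to point matplotlib at a fresh, writable directory for its config and cache via the MPLCONFIGDIR environment variable, so the font cache is rebuilt somewhere known to be writable. Nobody in this thread has confirmed that this is the fix, so treat it as an assumption. The helper name env_with_fresh_mpl_dir is hypothetical.

```python
import os
import tempfile


def env_with_fresh_mpl_dir(base_env=None):
    """Return a copy of the environment with MPLCONFIGDIR set to a new temp dir.

    Matplotlib uses MPLCONFIGDIR, when set, for its config and cache files.
    A freshly created temporary directory is writable by the current user,
    which rules out a stale or unwritable cache location as the culprit.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["MPLCONFIGDIR"] = tempfile.mkdtemp(prefix="mplconfig-")
    return env
```

The returned dict can be passed as the env argument of subprocess.run when launching the plotting step, which leaves your normal shell environment untouched.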

The messages in your KAT log indicate that you should request a larger hash size for the computation, so you probably want to increase it using the -H parameter (default is 100000000). However, once you have fixed the matplotlib problem, you shouldn't need to rerun the whole process, as the matrix file (called SRs_vs_contigs) has already been generated. All you need to do is plot it using kat plot spectra-cn -o plot SRs_vs_contigs.
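If you want to script the two options, a small sketch like this builds both command lines: re-plotting from the existing matrix, and a full rerun with a larger hash. It only uses the flags mentioned in this thread. The function names are made up, and the hash size of 1000000000 is an arbitrary illustrative value, not a recommendation.

```python
import shlex


def kat_comp_cmd(reads, assembly, out_prefix, threads=8, kmer=17, hash_size=None):
    """Build a `kat comp` argument list; -H is only added when hash_size is given."""
    cmd = ["kat", "comp", "-t", str(threads), "-m", str(kmer), "-o", out_prefix]
    if hash_size is not None:
        cmd += ["-H", str(hash_size)]
    return cmd + [reads, assembly]


def kat_plot_spectra_cn_cmd(matrix, out_name):
    """Build a `kat plot spectra-cn` argument list for an existing matrix file."""
    return ["kat", "plot", "spectra-cn", "-o", out_name, matrix]


# Re-plot only, reusing the matrix that the stalled job already wrote.
replot = kat_plot_spectra_cn_cmd("SRs_vs_contigs", "plot")

# Full rerun with a larger hash (illustrative value, ten times the default).
rerun = kat_comp_cmd("reads.fastq", "contigs.fasta", "SRs_vs_contigs",
                     hash_size=1_000_000_000)

print(shlex.join(replot))
print(shlex.join(rerun))
```

The lists can be handed straight to subprocess.run; printing them first makes it easy to check the command before submitting it to the scheduler.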

Best, Jon