goodest-goodlab / pseudo-it

Beta version of the new pseudo-it software for iterative reference guided assemblies.
GNU General Public License v3.0

Processes sometimes become locked during variant calling #5

Open gwct opened 2 years ago

gwct commented 2 years ago

On some datasets (Peromyscus), the processes spawned to call variants with GATK sometimes get locked or fail to spawn a new GATK process, so only one process ends up calling variants. This is a major problem, since one of the big advantages of pseudo-it is that it lets users call variants in parallel and drastically speed up run times.

Some notes about this issue:

Things to try:

Screenshot of htop on the Peromyscus data: [image: pseudo-it-procs]
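
One way to make a stalled worker visible, rather than letting it silently hold a slot, is to run each external job with a timeout and report a status per job. The following is a minimal sketch under stated assumptions, not pseudo-it's actual implementation: the `run_job` and `run_all` helpers, the job names, and the timeout values are all hypothetical, and the stand-in commands simply launch the Python interpreter instead of GATK.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_job(name, cmd, timeout_s):
    """Run one external command and return (name, status).

    Status is 'ok', 'failed' (non-zero exit code), or 'timeout'.
    """
    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills and reaps the child before raising,
        # so a stalled job does not keep occupying a worker slot.
        return name, "timeout"
    return name, "ok" if result.returncode == 0 else "failed"


def run_all(jobs, max_workers, timeout_s):
    """Run every job in `jobs` (name -> argv list) with bounded parallelism.

    Threads are enough here because each worker only waits on a child process.
    """
    statuses = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [
            pool.submit(run_job, name, cmd, timeout_s)
            for name, cmd in jobs.items()
        ]
        for future in as_completed(futures):
            name, status = future.result()
            statuses[name] = status
    return statuses


if __name__ == "__main__":
    # Hypothetical stand-ins for per-scaffold variant-calling commands.
    jobs = {
        "scaffold_1": [sys.executable, "-c", "print('done')"],
        "scaffold_2": [sys.executable, "-c", "import time; time.sleep(30)"],
    }
    for name, status in sorted(run_all(jobs, max_workers=2, timeout_s=5).items()):
        print(name, status)
```

Jobs reported as `timeout` could then be retried or logged, which at least distinguishes "a worker is stuck" from "no new worker was started". A sensible timeout for real variant calling would depend on scaffold size and is not something this sketch tries to estimate.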

liphardt commented 2 years ago

I am having the same issue with a Phyllotis dataset. Now that it has reached the larger scaffolds, it is processing them sequentially. [image: phyllotis_pseudoit]

gwct commented 2 years ago

Thanks for pointing this out. It's odd how this only seems to affect some datasets. I think I'm familiar with the cluster you're running on and I can see that you're on compute-0-24. Have you tried a node with more memory?

However, as per #6, this will likely not be fixed in this version of pseudo-it since I'm aiming to re-implement the whole pipeline with snakemake. Hopefully that will be mostly done (or at least in a usable form) by the end of August. Happy to chat elsewhere about this too since you likely don't want to wait that long.