nf-core / deepvariant

"can't find current frequency file" error #21

cm760 opened this issue 5 years ago

cm760 commented 5 years ago

Hello,

I've recently started using this pipeline for a project I'm working on and I've run into an issue. I ran the pipeline on one FASTA file, one BAM file with its corresponding BED file, and a custom config file, using the following command-line invocation:

nextflow run nf-core/deepvariant --fasta indexreference/PlasmoDB-41_Pfalciparum3D7_Genome.fasta --bam alignments/ISO_349.bam --bed coverage_decay_plots/PlasmoDB-41_Pfalciparum3D7_Genome_ISO_349.per-base.bed.gz -c configs/conf/prince.config

I received an error message, and I'm not quite sure what it means. I've run this pipeline twice using the same command-line invocation and got the same error message both times, as attached here (nextflow.log.txt) and partially quoted below:

File "/opt/conda/envs/nf-core-deepvariant-1.0/lib/python2.7/site-packages/psutil/_pslinux.py", line 701, in cpu_freq "can't find current frequency file") NotImplementedError: can't find current frequency file parallel: This job failed: /opt/conda/envs/nf-core-deepvariant-1.0/bin/python /opt/conda/envs/nf-core-deepvariant-1.0/share/deepvariant-0.7.0-0/binaries/DeepVariant/0.7.0/DeepVariant-0.7.0+cl-208818123/make_examples.zip --mode calling --ref PlasmoDB-41_Pfalciparum3D7_Genome.fasta.gz --reads ISO_349.bam --regions PlasmoDB-41_Pfalciparum3D7_Genome_ISO_349.per-base.bed.gz --examples ISO_349_shardedExamples/ISO_349.bam.tfrecord@16.gz --task 10

The Singularity image was pulled into the appropriate directory successfully (it showed up as nfcore-deepvariant-1.0.img), but the pipeline always fails on the make_examples process with this error. Is this a problem with the job not having enough resources to run, or a different issue entirely?
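In case it's useful, here is roughly how we've been looking at the per-task logs (a sketch only, assuming Nextflow's standard work/ directory layout; nothing here is specific to this run):

# Find the failing make_examples task in the run log, then read the
# per-task stderr that Nextflow keeps alongside each job in its work dir.
grep -i 'make_examples' .nextflow.log | tail -n 5
tail -n 40 work/*/*/.command.err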

For reference, I'm launching this from my Mac (macOS High Sierra, 10.13.4) against my institution's HPC with the Slurm executor. I have Nextflow 19.04.0 installed, and the containers run under Singularity version 3.2.1.

Thank you very much in advance!

PhilPalmer commented 5 years ago

Hi @cm760,

Thanks for using the pipeline, sorry it's not working as expected.

Unfortunately I haven't seen this error message before.

Are you running it with any profiles? If not, try appending -profile standard,singularity to your command, e.g.:

nextflow run nf-core/deepvariant --fasta indexreference/PlasmoDB-41_Pfalciparum3D7_Genome.fasta --bam alignments/ISO_349.bam --bed coverage_decay_plots/PlasmoDB-41_Pfalciparum3D7_Genome_ISO_349.per-base.bed.gz -c configs/conf/prince.config -profile standard,singularity

cm760 commented 5 years ago

Hi @PhilPalmer, thank you very much for your reply!

I tried rerunning the pipeline using the command you specified, but it returned the same error message. I then tried running it again with -profile standard,singularity but without -c configs/conf/prince.config, in case there was an issue with our config file; unfortunately, it produced the same error message again.

My colleague and I are unsure of how to continue, so any guidance would be much appreciated.

PhilPalmer commented 5 years ago

No worries

I just spotted that your BED file is compressed, which may be causing the error. Can you please try uncompressing it and rerunning the command?
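For example, something like this (filenames taken from your earlier command, so adjust the paths if they differ on your side):

gunzip coverage_decay_plots/PlasmoDB-41_Pfalciparum3D7_Genome_ISO_349.per-base.bed.gz
nextflow run nf-core/deepvariant --fasta indexreference/PlasmoDB-41_Pfalciparum3D7_Genome.fasta --bam alignments/ISO_349.bam --bed coverage_decay_plots/PlasmoDB-41_Pfalciparum3D7_Genome_ISO_349.per-base.bed -c configs/conf/prince.config -profile standard,singularity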

cm760 commented 5 years ago

I tried uncompressing it and running the pipeline with the following command (we used a different BED file, but for our purposes it does not matter):

nextflow run nf-core/deepvariant --fasta indexreference/PlasmoDB-41_Pfalciparum3D7_Genome.fasta --bam alignments/ISO_349.bam --bed PlasmoDB-41_Pfalciparum3D7_Genome.fasta.bed -c configs/conf/prince.config -profile standard

We were met with the same error. We also noticed that our fasta and bed files had headers in them that we thought might have interfered with the pipeline, so we tried running the same command with our new, "clean header" fasta and bed files. No such luck.

PhilPalmer commented 5 years ago

Hi, sorry for the late response.

Can I ask what is in your prince.config file?

Also, I don't think this is causing the error, but how come you're not using the singularity profile as well as standard in the command above?

Judging from the nextflow.log.txt file you sent, all of the files seem to be passed to DeepVariant correctly. I also tested running the pipeline with parameters similar to yours, and it worked with the following command:

nextflow run main.nf --fasta testdata/hg19.fa --bam testdata/NA12878_S1.chr20.10_10p1mb.bam --bed testdata/test_nist.b37_chr20_100kbp_at_10mb.bed -profile standard,docker

This makes me think it could be a problem with the input data, perhaps because the models being used were trained on human data and not Plasmodium.

As I am not sure, I have opened an issue on the main DeepVariant repo here: https://github.com/google/deepvariant/issues/191

cm760 commented 5 years ago

The prince.config is the one listed here for our institution. The file specifies that Singularity is enabled, which executor to use, and where to publish the Singularity images.

Also, sorry, I forgot to mention: I tried both -profile standard and -profile standard,singularity, and because I received the same error message both times, I assumed that this wasn't the issue.

Thanks for making the issue; we'll be following it. We were excited to try this pipeline for our project, but we didn't consider the implications of using it on Plasmodium instead of human data.

PhilPalmer commented 5 years ago

Are you able to send me the files to test with?

Are they small and/or publicly available?

tobsecret commented 5 years ago

Thanks for your help, @PhilPalmer! I'm the supervisor on the project where we meant to use nf-core/deepvariant and ran into this issue. It seems like google/deepvariant#191 established that it is an issue in the DeepVariant codebase (i.e. a call to psutil that does not work on our cluster for some reason), so it should be fixed for us as soon as they make a new release and the container for nf-core/deepvariant gets an update.
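For anyone hitting the same thing, a rough way to confirm it on a compute node is to check whether the sysfs frequency files psutil reads are present at all (the exact subpath psutil inspects varies by version, so treat this as a sketch):

# psutil's Linux cpu_freq() reads CPU frequency files from sysfs; on nodes
# where they are missing it raises "can't find current frequency file".
ls /sys/devices/system/cpu/cpu0/cpufreq/ 2>/dev/null || echo "no cpufreq entries under sysfs on this node"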

We're going to test with some publicly available files and send you an update.