Hi,
I did some investigation and I think the problem lies in the concurrent runs.
I submitted the following command five times almost at once, directing the results into different folders; here is just one example:
/opt/conda/bin/python /opt/iedb/mhc_ii/mhc_II_binding.py NetMHCIIpan DRB1*13:01 patient1_tumor.25.fa.split_401-600 25 > results_15/result_1131.txt 2> results_15/error_1131.txt
I think it's the concurrent runs, because when I run NetMHCIIpan sequentially, one job at a time, everything goes well.
I contacted Morten Nielsen, who provides technical support for NetMHCIIpan, and I'm waiting for his advice, but in the meantime I tried to force the pipeline to run pVACseq sequentially instead of in parallel. I tried to do this by using the buffer operator on the pVACseq input channel, like so:
process 'pVACseq' {
    tag "${meta.sampleName}"
    label 'pVACtools'

    input:
    tuple(
        val(meta),
        path(vep_phased_vcf_gz),
        path(anno_vcf),
        val(hla_types),
        val(tumor_purity),
        path(iedb_install_ok)
    ) from mkPhasedVCF_out_pVACseq_ch0
        .join(vcf_vep_ex_gz, by: [0])
        .combine(hlas.splitText(), by: 0)
        .combine(purity_estimate_ch1, by: 0)
        .combine(iedb_install_out_ch)
        .buffer(1)
But all this did was prevent any of the pVACseq processes from being triggered. Could you recommend a workaround, i.e. something that makes pVACseq run one by one rather than in parallel across all the HLA alleles?
Hi,
that's interesting. I never hit that issue. I'll have a look into it as well, but right now I'm a bit busy.
Did you try to reduce the number of cpus for pVACseq, e.g.:
change conf/process.config
from
withName:pVACseq {
    cpus = 10
}
to
withName:pVACseq {
    cpus = 2
}
Hi, thanks for getting back to me. The developer of NetMHCIIpan was unable to reproduce the error and recommended that I contact IEDB support, which is what I'm going to do. In the meantime, someone from the Nextflow community recommended setting maxForks to 1, and that worked! pVACseq now runs one job at a time and the pipeline finished successfully! Was your recommendation to reduce cpus made with the same aim? https://www.nextflow.io/docs/latest/process.html#maxforks
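For anyone hitting the same problem, the change is tiny; a minimal sketch, assuming it goes into the pVACseq block of conf/process.config (the same directive can also be set directly inside the process definition, and the cpus value is just whatever is already configured there):

withName:pVACseq {
    cpus = 10       // existing value, unchanged
    maxForks = 1    // allow only one pVACseq task instance to run at a time
}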
Hi,
I'm glad that maxForks worked. It is not exactly the same as setting cpus, but I'd expect that the effect will be similar.
Hi, I would like to ask for help with the consequences of swapping the containers for pVACtools, but first I want to explain why I'm doing this. 1) I was running predictions with your original containers, but I was getting an error for MHC class II peptides while predicting binding affinity with NetMHCIIpan
I did some research and did not see this error reported by anyone else, and I also could not reproduce it consistently! I thought maybe it's some of the nodes on the HPC, or maybe it depends on the order in which the predictions are made. Either way, rather than investigating further, I thought I'd first try newer versions of the mhc_i and mhc_ii tools. 2) I renamed the nextNEOpi_1.3_resources/databases/iedb folder and replaced the URLs in params.config with the following
This way I thought I would trick the pipeline into re-installing them. The new MHC I tools installed fine, but MHC II did not. The process downloaded the tar.gz file and unpacked it, but then, while running configure.py, the pipeline threw the following error:
3) I tried to install the MHC II tools manually, but I ran into problems installing some Perl modules. I skipped this because I don't really like troubleshooting Perl problems. Instead, I reverted the IEDB installation folder but decided to use the newer pVACtools container that has them. 4) So I pulled the container from https://hub.docker.com/r/griffithlab/pvactools/tags and changed the Singularity run options, i.e. I removed the binding of the IEDB and MHCflurry folders. I changed this
runOptions = "--no-home --containall" + " -H " + params.singularityTmpMount + " -B " + params.singularityAssetsMount + " -B " + params.singularityTmpMount + " -B " + params.resourcesBaseDir + params.singularityHLAHDmount + " -B " + params.databases.IEDB_dir + ":/opt/iedb" + " -B " + params.databases.MHCFLURRY_dir + ":/opt/mhcflurry_data"
into this:
runOptions = "--no-home --containall" + " -H " + params.singularityTmpMount + " -B " + params.singularityAssetsMount + " -B " + params.singularityTmpMount + " -B " + params.resourcesBaseDir + params.singularityHLAHDmount
And I changed the container in process.config. This didn't help either, though. I'm still getting a similar error:
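For completeness, the container change in process.config was along these lines; this is just a sketch, assuming the pVACtools processes pick up their container via a withLabel block (it may be a withName block in your copy), and the image tag below is only a placeholder for whatever tag was actually pulled from Docker Hub:

withLabel:pVACtools {
    // placeholder tag; use the tag actually pulled from griffithlab/pvactools
    container = 'docker://griffithlab/pvactools:4.0.1'
}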
I'll reach out to the developers of NetMHCII too, but maybe you have some recommendations on how to work around this? Best wishes, Magda