Closed lauzikaite closed 7 years ago
Hi!
It's very difficult to identify the source of your error from here. The error might result from a problem reading the files. In any case, the error should not be related to IPO, as IPO itself does not access the files. This is done by xcms.
So can you confirm that all your files together work with xcmsSet(), with and without parallelisation?
Maybe these two Stack Overflow questions give you a helpful hint?
Otherwise: Can you make your files available? Then I can offer to try to reproduce your error.
Thanks for your reply!
Yes, all tested files work with xcmsSet(), both with (through the BPPARAM setting) and without parallelisation.
Regarding the Stack Overflow suggestions: parallel::parSapply() with FUN = xcmsRaw() via a PSOCK cluster works fine with these files. Which suggests that file reading is not the problem?
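The read check described above can be sketched roughly as follows. This is only an illustration, not the script actually used in this thread: `check_reading`, `try_read` and `read_with_xcms` are hypothetical helper names, and the default reader assumes that xcms is installed on the workers.

```r
library(parallel)

# Hypothetical default reader: load one file with xcms (assumes xcms is installed).
read_with_xcms <- function(f) xcms::xcmsRaw(f)

# Hypothetical worker function: TRUE if the reader succeeds, FALSE if it errors.
try_read <- function(f, reader) {
  tryCatch({
    reader(f)
    TRUE
  }, error = function(e) FALSE)
}

# Read every file on a PSOCK cluster and report which files could be read.
check_reading <- function(files, n_workers = 2, reader = read_with_xcms) {
  cl <- makePSOCKcluster(n_workers)
  on.exit(stopCluster(cl), add = TRUE)
  ok <- parSapply(cl, files, try_read, reader = reader)
  names(ok) <- files
  ok
}
```

Calling `check_reading(files)` returns a named logical vector; any FALSE entry points at a file that could not be read on a worker.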
Sample mzML files are in Dropbox here (too large to upload to GitHub).
Is there anything else I could try?
Thank you for the files. I'll have a look as soon as possible.
Can you confirm that optimizeXcmsSet works with BPPARAM = MulticoreParam(workers = 1)?
At the moment I assume that there is a problem with the combined use of IPO's nSlaves argument and the xcms BPPARAM argument. The above suggestion to set workers = 1 suppresses xcms parallelisation.
Can you additionally try setting nSlaves = 1 as in the following code (thus only using xcms parallelisation)?
opp <- optimizeXcmsSet(files = files,
                       params = ppparam,
                       BPPARAM = MulticoreParam(workers = 5),
                       nSlaves = 1,
                       subdir = output_dir)
Thank you for the suggestions.
At the moment I can confirm that optimizeXcmsSet works with BPPARAM = MulticoreParam(workers = 1).
I will check whether setting nSlaves = 1, while retaining BPPARAM = MulticoreParam(workers = 5), works and will get back to you.
I ran a few more tests of optimizeXcmsSet to check which nSlaves and workers settings don't work together: +, function was executed; -, function was halted.
What is the default setting for nSlaves? When nSlaves is not specified, the function behaves differently from nSlaves = 1, which I assumed was the default. Apologies for a naive question, but how can it be 0?
Thank you very much for your tests. The error seems to come from an interaction between the parallel package and the BiocParallel package. I'll check that and get back to you, also with answers to your questions.
OK. It really seems that the problem arises from using the package parallel (so IPO's nSlaves argument > 1) and BiocParallel (so the BPPARAM argument with workers > 1) together. To answer one of your questions, the default value for nSlaves is 4. I cannot say why it was chosen that way, but I tend to keep it for compatibility reasons. So when the BPPARAM argument is used without setting nSlaves, the parallelisation crashes.
I wonder why an nSlaves argument of NA works for you in any case, as IPO should crash. I'll add a warning message and a fix for that.
Regarding your questions: NA does not mean that nSlaves is not specified. It sets nSlaves to the value NA. I'm not sure about your question regarding 0, as NA is different from 0. In any case, IPO treats all numeric values of nSlaves <= 1 the same, which is "do not use parallelisation". Some more argument checking would be nice, but there's no urgent reason. You can create an enhancement issue if you would like me to keep it in mind for the future.
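To illustrate why NA and 0 behave differently, here is a minimal base R sketch of the kind of argument check mentioned above. It is not IPO's actual code: `normalise_n_slaves` is a hypothetical helper that only mirrors the described rule (numeric values <= 1 mean "no parallelisation") and rejects NA explicitly.

```r
# Hypothetical helper, not part of IPO: validate an nSlaves-style argument.
normalise_n_slaves <- function(n_slaves) {
  # Reject anything that is not a single, non-missing number (this includes NA).
  if (length(n_slaves) != 1 || !is.numeric(n_slaves) || is.na(n_slaves)) {
    stop("nSlaves must be a single non-missing number")
  }
  # All numeric values <= 1 mean "do not use parallelisation".
  if (n_slaves <= 1) {
    1L
  } else {
    as.integer(n_slaves)
  }
}

# A comparison with NA yields NA rather than TRUE or FALSE,
# so an unguarded `if (nSlaves <= 1)` fails when nSlaves is NA.
NA <= 1
```

With this sketch, `normalise_n_slaves(0)` and `normalise_n_slaves(1)` both return 1, while `normalise_n_slaves(NA)` stops with an error instead of failing later in a less obvious place.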
I'll update IPO here and on Bioconductor in the next few days to close this issue.
In the meantime the quick solution is: Either set nSlaves = 1 or use BPPARAM with workers = 1.
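As a rough sketch of this workaround, the helper below puts all cores on exactly one parallelisation layer. `single_layer_settings` is a hypothetical name introduced here for illustration. The commented call reuses the argument names from the code earlier in this thread and assumes that IPO and BiocParallel are installed and that files, ppparam and output_dir are defined as in the original script.

```r
# Hypothetical helper: use either xcms (BiocParallel) or IPO (parallel)
# parallelisation, never both at once.
single_layer_settings <- function(n_cores, layer = c("xcms", "IPO")) {
  layer <- match.arg(layer)
  if (layer == "xcms") {
    list(nSlaves = 1, workers = n_cores)   # only xcms runs in parallel
  } else {
    list(nSlaves = n_cores, workers = 1)   # only IPO runs in parallel
  }
}

# Usage sketch (not run here):
# s <- single_layer_settings(5, "xcms")
# opp <- optimizeXcmsSet(files = files,
#                        params = ppparam,
#                        BPPARAM = BiocParallel::MulticoreParam(workers = s$workers),
#                        nSlaves = s$nSlaves,
#                        subdir = output_dir)
```

Either choice keeps one of the two settings at 1, which matches the combinations reported as working above.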
Thank you for looking into this, I will use the simpler version of the function until the package is updated.
OK, the check for the nSlaves argument is addressed by dcaba57f5dcc5afe49b1dd8a1ef99b59374bc077 and the issue itself by 355c240954c7cce7f21f5cd52c38b3fa10119e7e
The changes are also pushed to Bioconductor release and development version. It might take some time to show up there.
@lauzikaite Thank you very much for your help
Hi,
I am running optimizeXcmsSet() on an LC-ToF-MS dataset - only on 5 pooled sample mzML files - and receive the following error after about 10 min, during which 5 independent R processes were successfully initiated:
xcmsSet() on these files works perfectly fine. I am able to run xcmsSet() through both parallel (using PSOCK cluster) and BiocParallel packages (using BPPARAM).
The script for IPO optimisation:
What could have gone wrong with optimizeXcmsSet()?
Thank you in advance for your help!