mathiaskalxdorf / IceR

Quantitative proteomics workflow
https://mathiaskalxdorf.github.io/IceR/

Error in if: missing value where TRUE/FALSE needed #18

Closed. uonnet closed this issue 2 years ago

uonnet commented 2 years ago

Hi,

I got this error when I ran IceR

Warning: Error in if: missing value where TRUE/FALSE needed
  2: shiny::runApp
  1: runIceR

What does the error mean? Thanks!

mathiaskalxdorf commented 2 years ago

Hi,

Can you please provide more information? Which version of IceR do you use (master or development branch)? Which R version do you use? Which OS do you use? How many samples do you want to process? Please also provide the complete log displayed in the console.

Best regards

uonnet commented 2 years ago

Hi,

I am using the master branch (v0.9.9, I believe) and R 3.6.1 on a CentOS HPC. The analysis I was trying to run has 35 samples (35 files). Past IceR analyses with fewer samples (~12) completed without errors or problems. Those successful runs used the same config and the same set-up as the present one. And here is the log: failed log.txt.

There were actually some hiccups during the run. When it was preparing the all_ion_lists RData files, somehow only 18 of the 35 files were processed even though I waited for a day. I restarted the analysis at this point, and then the rest of the files were processed too and the run went through until after "Addition of +1 isotope features". I waited for a few days and IceR was still not moving to the next step, and again only 18/35 files had RData files in the Extracted decoy intensities_IceR_analysis folder. So I restarted IceR again, which is why you will see IceR saying some of the files are already there, but then it really failed with that error message. There was no change to the MQ results, IceR parameters, or raw or mzXML files throughout.

Hope the information helps! Thanks!
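
To see which samples a stalled run has skipped (for example the 17 of 35 files without RData output described above), a quick directory comparison can help. This is a minimal sketch in Python, not part of IceR. The function name `missing_outputs` is hypothetical, and it assumes each output is named after its sample followed by a fixed suffix such as `_all_ions.RData`. Adjust the extension and suffix to whatever your folders actually contain.

```python
from pathlib import Path


def missing_outputs(raw_dir, out_dir, raw_ext=".mzXML", out_suffix="_all_ions.RData"):
    """Return sample names that have a raw file but no matching output file.

    Assumption (not confirmed by IceR's documentation): outputs are named
    "<sample><out_suffix>", where <sample> is the raw file name without
    its extension.
    """
    raw_dir, out_dir = Path(raw_dir), Path(out_dir)
    # Sample names derived from the input files, e.g. "run01.mzXML" -> "run01"
    samples = sorted(p.stem for p in raw_dir.glob(f"*{raw_ext}"))
    # Sample names derived from finished outputs, suffix stripped
    done = {p.name[: -len(out_suffix)] for p in out_dir.glob(f"*{out_suffix}")}
    return [s for s in samples if s not in done]
```

Calling `missing_outputs("path/to/mzXML", "path/to/outputs")` with your own (here placeholder) paths returns the sorted list of samples that still lack an output file.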

mathiaskalxdorf commented 2 years ago

Hey,

Your error looks similar to another one (issue #6). That user was also working on a CentOS HPC and ran into trouble. It looked as if IceR had stopped working, although it was in fact still running, and after some waiting it finished without errors. The package should display progress bars, especially during time-consuming steps. Looking through your log, it seems you similarly stopped IceR during steps that typically take some time. As I'm not familiar with the CentOS HPC system: do you only see the R console, or also a graphical desktop of the OS? If you can only see a console, you might miss the additional progress bars (e.g. on a Windows machine, new windows open that show progress bars with estimated remaining times). I suspect that you only see the console with the CentOS setup?

Still, it looks like the whole process takes much too long. Did you adjust the number of CPU cores to be used by IceR? I don't know how many CPU cores your machine has, but most likely more than 20? If at least 35 cores are available (and enough RAM), I would suggest setting the number of CPU cores to 35, so that all your files are processed simultaneously. With the default settings, IceR handles only 4 tasks simultaneously, which can result in a very long processing time. Without seeing the progress bars, it might then look like IceR stopped working. Furthermore, the more samples are processed, the more features typically have to be quantified, and that can really take some time.

Can you check that an appropriate number of CPU cores is set to be used by IceR? If not, adjust it to a suitable number of cores. To avoid running into the final error again, I would suggest removing the temporary files folder as well as the Decoy intensities folder in the folder where you have the raw/mzXML files.

I will add some more information to the log output to make clear that the process is running (in cases where the progress bars are not visible) and, e.g., how many features have to be requantified.

Just as a reference: with an appropriate number of CPU cores, we are able to process 100 raw files with ~200,000 features over a weekend. So I assume that 35 samples with the right settings should also be doable over a weekend. Still, keep in mind that IceR performs many heavy tasks, especially during 2D peak detection/DICE quantification, and this step might still require some optimization.

Best regards,

Mathias

uonnet commented 2 years ago

Hi,

Thanks for the detailed reply. Yes, I gave IceR 50 cores, as high as I could set in the GUI. I also made sure IceR can indeed use all 50 cores.

For progress bars, I did see them popping up as IceR was running. When it appeared to stop working, the progress bars also stopped appearing.

When you analysed 100 raw files, did you run it on Windows? And did you see IceR "stopping" and progress bars stop popping up? Do you mind sharing your parameters? I just used the default parameters for alignment windows and others.

mathiaskalxdorf commented 2 years ago

Ok, maybe that is the issue. I never tested what happens if more cores are supplied than there are samples to be processed. Potentially the spare threads are waiting for a trigger which they never get. But maybe the behaviour also differs between Windows and Linux. So far I have mostly tested on Windows. I will check on my side what happens when more cores are supplied than are actually needed. IceR can use at most as many cores as there are samples to process. If this is causing the issue, I will add a check for that to the code. Regarding parameters, I also usually use only the default parameters, apart from adjusting the number of CPU cores.
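
The kind of check mentioned here can be sketched as follows. This is an illustration in Python, not IceR's actual code (IceR is an R package), and the function name `effective_workers` is hypothetical. It caps the requested worker count at the number of samples and at the number of cores the machine reports, so no spare worker is started without a task.

```python
import os


def effective_workers(requested, n_samples, available=None):
    """Cap the requested worker count so no worker is left without a sample.

    requested: number of cores the user asked for
    n_samples: number of files/samples to process
    available: cores on the machine (defaults to what the OS reports)
    """
    if requested < 1 or n_samples < 1:
        raise ValueError("requested and n_samples must both be at least 1")
    if available is None:
        # os.cpu_count() can return None on some platforms; fall back to 1
        available = os.cpu_count() or 1
    return max(1, min(requested, n_samples, available))
```

With 50 requested cores, 35 samples, and a 64-core machine, this yields 35 workers; with 15 requested cores it leaves the request unchanged at 15.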

uonnet commented 2 years ago

Hi Mathias,

Ok, I see. I am using the default parameters as well, except for the number of CPU cores. I can try giving IceR 35 cores, the same as the number of raw files, and see if anything changes. Thanks!

uonnet commented 2 years ago

Hi again Mathias,

Log attached: 15cores failed log.txt

I gave IceR 15 cores but it still failed at the same point, with the same error as before. It was much faster though, and also didn't "stop" processing the files.

mathiaskalxdorf commented 2 years ago

Hey,

Ok, this looks like a bug that I have to solve. However, this is difficult without having the data here. Would it be possible for you to upload the MaxQuant files and the "_all_ions.RData" files (stored in the /raw/mzXML/ folder) somewhere, so that I can use them on my side to find the bug? If this is not feasible, you would have to execute the IceR code line by line to find the spot where the error occurs. Then I could most likely solve the problem.

Best,

Mathias

uonnet commented 2 years ago

Hi Mathias,

Sorry for getting back late. Yes we can share the MaxQuant files and the "_all_ions.RData" files. Is there somewhere I can upload them for you?

Thanks again!

mathiaskalxdorf commented 2 years ago

At the moment my Google Drive is full; otherwise I would have suggested uploading the files there. Do you maybe have enough space on Google Drive, OneDrive, Dropbox, or any other online storage? Otherwise, I have had good experience (so far) with wetransfer.com, although its limit is 2 GB. How large are all the files (including the MaxQuant output) after zipping? Sorry for this extra trouble.

uonnet commented 2 years ago

Yes, I do have enough storage on my GDrive. How do I send you the link?

mathiaskalxdorf commented 2 years ago

Perfect. Can you send it to my email address (mathiaskalxdorf@gmail.com)?

uonnet commented 2 years ago

I sent you the files. The email title is "IceR error troubleshoot". Thank you!

mathiaskalxdorf commented 2 years ago

Hi, I had a look at your issue. You are trying to process samples that were measured with different LC gradient lengths. As described in the manual, IceR is not designed for that purpose, and such input results in unexpected errors. To prevent this kind of misuse in the future, I added several checks up front. In your case, IceR will now directly return an error. If the data nonetheless passed this check, there is another check later in the process that determines the deviation in m/z and RT between samples for the same peptides. In your case the median deviation is about 15 min, which is what causes the problem for IceR. At that point it will print some more details to the console, plus warnings if the deviation is too high.

So in summary, I'm sorry, but your data cannot be processed with IceR, as it is not designed for samples with multiple gradient lengths. You could only process the samples batch-wise, grouping those measured with the same gradient length.

Best
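
The idea behind the retention-time check described above can be illustrated with a small sketch. It is written in Python and is not IceR's implementation. The function names, the input layout (sample name mapped to peptide retention times in minutes), and the threshold are all assumptions made for illustration. For each sample it takes the peptides shared with a reference sample and computes the median absolute RT difference. Samples whose median exceeds the chosen limit are flagged.

```python
from statistics import median


def median_rt_deviation(rt_by_sample, reference):
    """Median absolute RT difference (minutes) of each sample vs. a reference.

    rt_by_sample: {sample: {peptide: retention time in minutes}}
    Samples sharing no peptide with the reference map to None.
    """
    ref = rt_by_sample[reference]
    deviations = {}
    for sample, rts in rt_by_sample.items():
        if sample == reference:
            continue
        shared = ref.keys() & rts.keys()  # peptides seen in both samples
        if not shared:
            deviations[sample] = None
            continue
        deviations[sample] = median(abs(rts[p] - ref[p]) for p in shared)
    return deviations


def flag_deviating(deviations, max_minutes):
    """Names of samples whose deviation is unknown or above max_minutes."""
    return sorted(s for s, d in deviations.items() if d is None or d > max_minutes)
```

A sample recorded with a different gradient would show a large median deviation (about 15 min in the case discussed here) and could be moved to its own batch before starting the run. The limit passed as `max_minutes` is a free choice in this sketch and does not reflect any threshold used by IceR.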

uonnet commented 2 years ago

Hi Mathias. I checked the files again. It turns out the files actually belong to 2 batches, each with a different LC gradient, and I accidentally grouped them together for analysis. I will remove the files in the test batch and run again. Will let you know if any issues arise. Thanks for helping!