MASHUOA / HiTMaP

An R package of High-resolution Informatics Toolbox for Maldi-imaging Proteomics
GNU General Public License v3.0

BiocParallel issue #4

Closed by GraceAHall 3 years ago

GraceAHall commented 3 years ago

Hi there,

I'm attempting to wrap HiT-MaP so it can be made available on the Galaxy platform, and I'm running into a number of issues during execution using your provided Docker image. Here is the first:

Loading raw image data for statistical analysis: sample.imzML
Preparing image data for statistical analysis: sample.imzML
Error in result[[njob]] <- value :
  attempt to select less than one element in OneIndex
Calls: imaging_identification ... chunk_apply -> bplapply -> bplapply -> bploop -> bploop.lapply
In addition: Warning message:
In parallel::mccollect(wait = FALSE, timeout = 1) :
  1 parallel job did not deliver a result
Execution halted

This error comes up often, depending on the parameters supplied to imaging_identification().

Here is the script which resulted in the above error:

library(HiTMaP)
datafile = "Mouse_brain.imzML"
database = "uniprot_mouse_20210107.fasta"
wd = "~/expdata/"

imaging_identification(
    datafile = paste0(wd,datafile),
    Fastadatabase = database,
    threshold = 0.005,
    ppm = 10,
    FDR_cutoff = 0.05,
    Decoy_mode = "isotope",
    Digestion_site = "trypsin",
    missedCleavages = 0:1,
    adducts = c("M+H"),
    Modifications = list(
            fixed = NULL,
            fixmod_position = NULL,
            variable = NULL,
            varmod_position = NULL
    ),
    spectra_segments_per_file=9,
    Segmentation="spatialKMeans",
    preprocess=list(
        force_preprocess=TRUE,
        use_preprocessRDS=TRUE,
        smoothSignal=list(method="Disable"),
        reduceBaseline=list(method="Disable"),
        peakPick=list(method="adaptive"),
        peakAlign=list(tolerance=10, units="ppm"),
        normalize=list(method=c("rms","tic","reference")[1],mz=1)
    ),
    output_candidatelist=T,
    use_previous_candidates=F,
    Smooth_range=1,
    Virtual_segmentation=FALSE,
    Virtual_segmentation_rankfile=NULL,
    score_method="SQRTP",
    peptide_ID_filter=2,
    Protein_feature_summary=TRUE,
    Peptide_feature_summary=TRUE,
    Region_feature_summary=TRUE,
    plot_cluster_image_grid=F,
    Rotate_IMG=NULL,
    Thread=8,
    IMS_analysis=T
)

This error occurs very frequently. Other examples:

xiweifan commented 3 years ago

Hi Grace,

Thank you for your help. I think I submitted the application for this R software to Galaxy Australia a while ago.

To my knowledge, the problem is often caused by insufficient RAM if you are using the nat_comms version on HPC or in Docker on a PC. If you check my previous issue, you may find a similar question and how George solved it. The easiest fix is to limit Thread to 1 and enlarge the RAM allocation, say to 1.5TB if possible, then narrow it down once you have had one successful run.
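For reference, here is a minimal sketch of that single-thread run, assuming the same input files as in the script above. Only Thread differs from the reported call (8 becomes 1). The arguments left out here would fall back to the package defaults, so carry over the rest of the original call as needed. This is only an illustration of the suggestion, not a guaranteed fix.

```r
library(HiTMaP)

# Same inputs as in the original report.
wd       <- "~/expdata/"
datafile <- "Mouse_brain.imzML"
database <- "uniprot_mouse_20210107.fasta"

# Thread = 1 avoids forking several workers, which is the suggested way to
# lower peak memory use. The other arguments shown are copied from the
# original call; anything not listed is left at the package default here.
imaging_identification(
    datafile = paste0(wd, datafile),
    Fastadatabase = database,
    threshold = 0.005,
    ppm = 10,
    FDR_cutoff = 0.05,
    spectra_segments_per_file = 9,
    Segmentation = "spatialKMeans",
    Thread = 1
)
```

Once this succeeds, Thread can be raised step by step while watching memory use.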

Kind regards, Xiwei

GraceAHall commented 3 years ago

Thanks Xiwei :)

HiTMaP should be up on the Galaxy ToolShed in the next few days. I'm just trying to iron out these last few issues so it's more stable.

I have done a number of runs using 1 thread on a computer with 64 GB of RAM and haven't had any issues! It's only about 50% quicker with 12 threads than with 1 thread anyway, so a single thread is probably the better setup.

Thanks for your reply. Grace