MASHUOA / HiTMaP

An R package of High-resolution Informatics Toolbox for Maldi-imaging Proteomics
GNU General Public License v3.0

interp1: table too short #6

Closed GraceAHall closed 5 months ago

GraceAHall commented 2 years ago

Hi team

I encountered another error and am not sure how to resolve it. I am using this imzML with the uniprot_mouse.fasta proteome.

Job in error state.. tool_id: hitmap, exit_code: 1, stderr: 24 Cores detected, 1 threads will be used for computing
1 files were selected and will be used for Searching
database.fasta was selected as database. Candidates will be generated through Proteomics mode
Found enzyme: trypsin
Found rule: ""
Found customized rule: ""
Testing fasta sequances for degestion site: ([KR](?=[^P]))|((?<=W)K(?=P))|((?<=M)R(?=P))
Generated 17080 Proteins in total. Computing exact masses...
Generating peptide formula...
Generating peptide formula with adducts: M+H
Calculating peptide mz with adducts: M+H
Candidate list has been exported.
database.fasta was selected as database 
Spectrum intensity threshold: 0.50% 
mz tolerance: 5 ppm Segmentation method: spatialKMeans 
Manual segmentation def file: None 
Bypass spectrum generation: FALSE
Found rotation info
Loading raw image data for statistical analysis: sample.imzML
Preparing image data for statistical analysis: sample.imzML
Error in interp1(noiseidx, noiseval, xi = t, method = "linear", extrap = mean(noiseval,  : 
  interp1: table too short
Calls: imaging_identification ... tryCatch -> tryCatchList -> tryCatchOne -> <Anonymous>
Execution halted

MASHUOA commented 2 years ago

Hi Grace, thanks for the information. The data set was acquired on a TOF/TOF system, which gives a mid-resolution (~15000 to 30000) m/z signal. In this case, I recommend using a tolerance of >25 ppm, which will trigger the mid-resolution data pre-processing workflow. HiTMaP is mainly designed for annotating high (or ultra-high) resolution MS data. The mid-resolution pre-processing workflow is implemented with the aim of rendering protein images on these datasets using existing annotation results (see https://github.com/MASHUOA/HiTMaP/blob/master/Resource/DB_stats_bin_mz_ppm.png). But you could still give it a go.
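To make the tolerance advice concrete, here is a rough back-of-the-envelope sketch in R. It compares the absolute m/z window implied by a ppm tolerance with the approximate peak width implied by the resolution range quoted above. The helper names `ppm_to_da` and `fwhm_da` and the example m/z of 1400 are illustrative assumptions, not part of HiTMaP.

```r
# Convert a ppm tolerance into an absolute m/z window (Da) at a given m/z.
# Hypothetical helper for illustration only.
ppm_to_da <- function(mz, ppm) mz * ppm / 1e6

# Approximate peak width (FWHM) implied by a resolving power R = m / delta_m.
# Hypothetical helper for illustration only.
fwhm_da <- function(mz, resolution) mz / resolution

mz <- 1400  # assumed example m/z, chosen for illustration

comparison <- data.frame(
  quantity = c("5 ppm window", "25 ppm window",
               "FWHM at R = 30000", "FWHM at R = 15000"),
  width_da = c(ppm_to_da(mz, 5), ppm_to_da(mz, 25),
               fwhm_da(mz, 30000), fwhm_da(mz, 15000))
)
print(comparison)
```

Under these assumptions, a 5 ppm window is several times narrower than a single peak at this resolution, while a 25 ppm window is closer to the same order of magnitude as the peak width.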

MASHUOA commented 2 years ago

Hi Grace, I found that the upper boundary of the m/z range causes this error. You need to update HiTMaP via the installation code (for an unknown reason, the Cardinal package won't read the data according to a given mz.range) and set "mzrange = c(1220,1624)" in the arguments. Here is some example code:

datafile=c("mouse_kidney_cut.imzML")
wd="~/expdata/kindney/"
library(HiTMaP)
imaging_identification(datafile=paste0(wd,datafile),
                       Digestion_site="trypsin",
                       threshold = 0,
                       Fastadatabase="uniprot_mouse_20210107.fasta",
                       output_candidatelist=T,
                       preprocess=list(force_preprocess=T,
                                       use_preprocessRDS=TRUE,
                                       smoothSignal=list(method="gaussian"),
                                       reduceBaseline=list(method="locmin"),
                                       peakPick=list(method="adaptive"),
                                       peakAlign=list(tolerance=25,level="global", units="ppm"),
                                       normalize=list(method="rms")),
                       spectra_segments_per_file=5,
                       use_previous_candidates=T,
                       ppm=25,
                       FDR_cutoff = 0.05,
                       IMS_analysis=T,
                       Thread=1,
                       Rotate_IMG=NULL,
                       cluster_rds_path="/combinedimdata.rds",
                       mzrange = c(1220,1624),
                       plot_cluster_image_grid=T,
                       pixel_size_um = 150,
                       remove_score_outlier=F,
                       attach_summary_cluster = F,
                       remove_cluster_from_grid=F,
                       plot_cluster_image_overwrite=T)
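As background on the error message itself: "table too short" suggests that the noise interpolation step received fewer than two points to interpolate between, which would be consistent with the m/z range problem described above. The sketch below shows that failure mode and one possible guard. It uses base R `approx` rather than the `interp1` call in the traceback, and `interp_noise` is a hypothetical helper, not code from HiTMaP or Cardinal.

```r
# Hypothetical guarded interpolation, for illustration only.
# Mirrors the shape of the failing call: linear interpolation of noise
# values, with the mean noise used outside the known range.
interp_noise <- function(noiseidx, noiseval, t) {
  ok <- is.finite(noiseidx) & is.finite(noiseval)
  # Linear interpolation needs at least two distinct x positions.
  if (length(unique(noiseidx[ok])) < 2) {
    fallback <- if (any(ok)) mean(noiseval[ok]) else NA_real_
    return(rep(fallback, length(t)))
  }
  fill <- mean(noiseval[ok])
  approx(noiseidx[ok], noiseval[ok], xout = t, method = "linear",
         yleft = fill, yright = fill)$y
}

# Two points: ordinary linear interpolation.
interp_noise(c(1, 3), c(10, 20), 2)

# One point: an unguarded interpolation would stop with an error here;
# the guard returns the single available noise value instead.
interp_noise(1, 5, 1:3)
```

Whether a fallback like this is appropriate depends on the data. If the table is short because the m/z range is wrong, correcting the range is the real fix, and a guard would only hide the problem.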

GraceAHall commented 2 years ago

Thanks for the solution and the quick reply!

I ran HiTMaP with the settings above, but I still get the same error.

We have chosen to use the Docker container for running HiTMaP, as this works smoothly on the Galaxy service. Is there an updated container we can use that contains the fix?

Grace