gtluu / timsconvert

https://gtluu.github.io/timsconvert/
Apache License 2.0

AmbiguousTermWarning: Multiple unit options are possible for parameter 'base peak intensity' but none were specified #33

Closed jflucier closed 1 year ago

jflucier commented 2 years ago

Hi,

When I run the program using the test data:

singularity exec --writable-tmpfs -e \
/nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/dia-ms.sif \
python /timsconvert/bin/run.py --input /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/massive.ucsd.edu/MSV000088438/updates/2022-02-24_gbass_3339138d/raw/timsconvert_raw_data_2/brevi_brachy.d --outdir /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/data --exclude_mobility --verbose

I get these warnings:

/miniconda3/lib/python3.7/site-packages/psims/mzml/components.py:585: AmbiguousTermWarning: Multiple unit options are possible for parameter 'base peak intensity' but none were specified
  self._check_params()
/miniconda3/lib/python3.7/site-packages/psims/mzml/components.py:614: AmbiguousTermWarning: Multiple unit options are possible for parameter 'base peak intensity' but none were specified
  self.write_params(xml_file)

Here is the full exec log:

convert_tests/massive.ucsd.edu/MSV000088438/raw/timsconvert_raw_data/dd/rs17e_1mgml_no_tims_1_0_C2_MS.d --outdir /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/data --exclude_mobility --verbose
2022-08-30_11-37-31-298235:Initialize Bruker .dll file...
2022-08-30_11-37-31-399510:Loading input data...
2022-08-30_11-37-31-399867:Reading file: /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/massive.ucsd.edu/MSV000088438/raw/timsconvert_raw_data/dd/rs17e_1mgml_no_tims_1_0_C2_MS.d
2022-08-30 11:37:31.526350 [tid=0x8630d180] [WARN ] bdal.io.tims.TdfReaderImpl: Requested recalibration file "/nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/massive.ucsd.edu/MSV000088438/raw/timsconvert_raw_data/dd/rs17e_1mgml_no_tims_1_0_C2_MS.d/calibration.sqlite" does not exists. Fallback to instrument calibration from tdf.
2022-08-30_11-37-31-575524:input: /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/massive.ucsd.edu/MSV000088438/raw/timsconvert_raw_data/dd/rs17e_1mgml_no_tims_1_0_C2_MS.d
2022-08-30_11-37-31-575752:outdir: /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/data
2022-08-30_11-37-31-575839:outfile: 
2022-08-30_11-37-31-575922:mode: centroid
2022-08-30_11-37-31-576006:compression: zlib
2022-08-30_11-37-31-576082:ms2_only: False
2022-08-30_11-37-31-576160:exclude_mobility: True
2022-08-30_11-37-31-576239:encoding: 64
2022-08-30_11-37-31-576365:barebones_metadata: False
2022-08-30_11-37-31-576441:profile_bins: 0
2022-08-30_11-37-31-576516:maldi_output_file: combined
2022-08-30_11-37-31-576591:maldi_plate_map: 
2022-08-30_11-37-31-576666:imzml_mode: processed
2022-08-30_11-37-31-576741:lcms_backend: timsconvert
2022-08-30_11-37-31-576815:chunk_size: 10
2022-08-30_11-37-31-576889:verbose: True
2022-08-30_11-37-31-576964:start_frame: -1
2022-08-30_11-37-31-577038:end_frame: -1
2022-08-30_11-37-31-577112:precision: 10.0
2022-08-30_11-37-31-577206:ms1_threshold: 100
2022-08-30_11-37-31-577315:ms2_threshold: 10
2022-08-30_11-37-31-577390:ms2_nlargest: -1
2022-08-30_11-37-31-577464:version: 1.1.0
2022-08-30_11-37-31-577538:infile: /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/massive.ucsd.edu/MSV000088438/raw/timsconvert_raw_data/dd/rs17e_1mgml_no_tims_1_0_C2_MS.d
2022-08-30_11-37-31-577616:.tsf file detected...
2022-08-30_11-37-31-577704:Processing MALDI dried droplet data...
2022-08-30_11-37-31-577784:Initializing mzML Writer...
2022-08-30_11-37-31-581618:Initializing controlled vocabularies...
2022-08-30_11-37-32-327398:Writing mzML metadata...
2022-08-30_11-37-32-334649:Writing data to .mzML file /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/data/rs17e_1mgml_no_tims_1_0_C2_MS.mzML...
2022-08-30_11-37-32-335093:Calculating number of spectra...
/miniconda3/lib/python3.7/site-packages/psims/mzml/components.py:585: AmbiguousTermWarning: Multiple unit options are possible for parameter 'base peak intensity' but none were specified
  self._check_params()
/miniconda3/lib/python3.7/site-packages/psims/mzml/components.py:614: AmbiguousTermWarning: Multiple unit options are possible for parameter 'base peak intensity' but none were specified
  self.write_params(xml_file)
2022-08-30_11-37-32-378134:Updating scan count...
2022-08-30_11-37-32-381740:Finished writing to .mzML file /nfs3_ib/ip29-ib/ip29/sheela_group/programs/containers/timsconvert_tests/data/rs17e_1mgml_no_tims_1_0_C2_MS.mzML...

Is this normal behaviour, or could it be of concern?

Thanks

mwang87 commented 2 years ago

Thanks for asking! I think those warnings are fine. We're likely doing something slightly wrong by writing out the mzML without the right units in that area, but intensity generally doesn't need units anyway. We'll take a look!
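
If the warnings are only noise in the log, one option is to filter them by message with Python's standard `warnings` module. The sketch below is a minimal illustration, not timsconvert's code: `AmbiguousTermWarning` here is a local stand-in class (the real one is defined inside psims), and `write_params` is a hypothetical function that just emits the same message.

```python
import warnings


class AmbiguousTermWarning(UserWarning):
    """Local stand-in for the psims warning class of the same name (assumption)."""


def write_params():
    # Hypothetical stand-in for the library call that triggers the warning.
    warnings.warn(
        "Multiple unit options are possible for parameter "
        "'base peak intensity' but none were specified",
        AmbiguousTermWarning,
    )
    return "written"


def run_quietly():
    # record=True collects any warnings that get through the filters.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # The message argument is a regex matched against the start of the text.
        warnings.filterwarnings(
            "ignore", message="Multiple unit options are possible"
        )
        result = write_params()
    return result, caught


result, caught = run_quietly()
print(result, len(caught))
```

Filtering on the message rather than on the class avoids importing anything from psims, at the cost of breaking silently if the wording of the warning ever changes.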

jflucier commented 2 years ago

This may be of interest to you: here is my recipe file for building the Singularity container (your HPC users might be interested):

# build new image using this command:
# singularity build --force --fakeroot dia-ms.sif dia-ms.def
# test env:
# singularity exec --writable-tmpfs -e \
# dia-ms.sif \
# python /timsconvert/bin/run.py

BootStrap: docker
From: ubuntu:20.04

%setup

%environment
    export PATH="/miniconda3/bin:$PATH"

%post
    apt-get update && apt-get -y upgrade

    ln -fs /usr/share/zoneinfo/America/New_York /etc/localtime

    apt-get -y install \
    build-essential \
    wget \
    bzip2 \
    ca-certificates \
    git \
    openjdk-17-jre \
    cpanminus \
    perl \
    software-properties-common

    rm -rf /var/lib/apt/lists/*
    apt-get clean

    cd /
    wget -c https://repo.anaconda.com/miniconda/Miniconda3-py39_4.11.0-Linux-x86_64.sh
    /bin/bash Miniconda3-py39_4.11.0-Linux-x86_64.sh -bfp /miniconda3
    export PATH=/miniconda3/bin:$PATH
    . /miniconda3/etc/profile.d/conda.sh

    # Single quotes defer expansion until the container environment is sourced.
    echo '__conda_setup="$(/miniconda3/bin/conda shell.bash hook 2> /dev/null)"' >> $SINGULARITY_ENVIRONMENT
    echo 'if [ $? -eq 0 ]; then' >> $SINGULARITY_ENVIRONMENT
    echo '    eval "$__conda_setup"' >> $SINGULARITY_ENVIRONMENT
    echo 'else' >> $SINGULARITY_ENVIRONMENT
    echo '    if [ -f "/miniconda3/etc/profile.d/conda.sh" ]; then' >> $SINGULARITY_ENVIRONMENT
    echo '        . "/miniconda3/etc/profile.d/conda.sh"' >> $SINGULARITY_ENVIRONMENT
    echo '    else' >> $SINGULARITY_ENVIRONMENT
    echo '        export PATH="/miniconda3/bin:$PATH"' >> $SINGULARITY_ENVIRONMENT
    echo '    fi' >> $SINGULARITY_ENVIRONMENT
    echo 'fi' >> $SINGULARITY_ENVIRONMENT
    echo 'unset __conda_setup' >> $SINGULARITY_ENVIRONMENT

    conda install -y python=3.7
    conda install -y -c bioconda nextflow

    cd /
    git clone -c core.symlinks=true https://www.github.com/gtluu/timsconvert
    cd /timsconvert
    pip install -r requirements.txt
    pip install git+https://github.com/gtluu/pyimzML

gtluu commented 1 year ago

Very late to this, but to elaborate on the warning this issue mentions: it was a warning from an old version of psims (0.1.34) that appeared when no unit was explicitly provided for the base peak intensity value. This resulted in the use of the default CV param MS:1000131 (number of detector counts), which should be the correct CV param. Commit 434d61f6696859074289a51f628a8c9677087c79 updated psims to version 1.2.7 and explicitly specifies this CV param for the base peak intensity value, so the warning should no longer appear.
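
For readers unfamiliar with how a unit is attached to a parameter in mzML, here is a rough sketch of what an explicitly unit-tagged `cvParam` element looks like. It uses only the standard library and is not the code from the commit above or the psims API; the function name is made up for illustration, and the accession MS:1000505 for "base peak intensity" is taken from the PSI-MS controlled vocabulary as I understand it.

```python
import xml.etree.ElementTree as ET


def base_peak_intensity_param(value):
    """Build an mzML-style cvParam for base peak intensity with an explicit unit.

    Hypothetical helper for illustration only; not part of timsconvert or psims.
    """
    return ET.Element(
        "cvParam",
        {
            "cvRef": "MS",
            "accession": "MS:1000505",  # base peak intensity
            "name": "base peak intensity",
            "value": str(value),
            # The unit attributes remove the ambiguity the warning complains about.
            "unitCvRef": "MS",
            "unitAccession": "MS:1000131",  # number of detector counts
            "unitName": "number of detector counts",
        },
    )


element = base_peak_intensity_param(1234.5)
print(ET.tostring(element, encoding="unicode"))
```

With the three `unit*` attributes present, a writer no longer has to pick among several permitted units for the term.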

Also, thank you for the Singularity container. I will reference it as I (slowly) continue to add various recipe files.