Shenhav-and-Korem-labs / q2-SCRuB

BSD 3-Clause "New" or "Revised" License

R error when running SCRuB in qiime2-2023.5 #4

Closed: callaband closed this issue 10 months ago

callaband commented 10 months ago

This is for my plate that was dropped on the floor, plus additional samples.

I installed SCRuB in my qiime2-2023.5 environment using pip install (it completed successfully and the output looked normal), then ran this command (CLI):

!qiime SCRuB SCRuB \
    --i-table ../data/urine/179648_urine_trim100_deblur_feature-table.qza \
    --m-metadata-file ../data/14472_metadata_updated.txt \
    --p-control-idx-column empo_2 \
    --p-sample-type-column sample_type \
    --p-well-location-column well_id \
    --o-scrubbed ../data/urine/179648_urine_trim100_deblur_ft_scrubbed.qza

Error received:

Plugin error from SCRuB:

An error was encountered while running SCRuB in R (return code 1), please inspect stdout and stderr to learn more.

Debug info has been saved to /var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/qiime2-q2cli-err-ta3bkl4x.log

Log file:

Running SCRuB on Qiime2!
/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_SCRuB/_method.py:150: FutureWarning: The behavior of .astype from SparseDtype to a non-sparse dtype is deprecated. In a future version, this will return a non-sparse array with the requested dtype. To retain the old behavior, use `obj.astype(SparseDtype(dtype))`
  table.to_csv(biom_fp, header=True)
Running external command line application(s). This may print messages to stdout and/or stderr.
The command(s) being run are below. These commands cannot be manually re-run as they will depend on 
temporary files that no longer exist.

Command: run_SCRuB.R --samples_counts_path /var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/samples.csv --sample_metadata_path /var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/metadata.csv --control_order NA --output_path /var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/scrubbed.Rdata

R version 4.2.3 (2023-03-15) 
Error in library(SCRuB, quietly = TRUE) : 
  there is no package called ‘SCRuB’
Calls: suppressMessages -> withCallingHandlers -> library
Execution halted
Command '['run_SCRuB.R', '--samples_counts_path', '/var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/samples.csv', '--sample_metadata_path', '/var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/metadata.csv', '--control_order', 'NA', '--output_path', '/var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/scrubbed.Rdata']' returned non-zero exit status 1.
Traceback (most recent call last):
  File "/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_SCRuB/_method.py", line 162, in SCRuB
    run_commands([cmd])
  File "/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_SCRuB/_method.py", line 39, in run_commands
    subprocess.run(cmd, check=True)
  File "/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/subprocess.py", line 516, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['run_SCRuB.R', '--samples_counts_path', '/var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/samples.csv', '--sample_metadata_path', '/var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/metadata.csv', '--control_order', 'NA', '--output_path', '/var/folders/m1/h11rhh850_s7fjxgxw5wzh_r0000gp/T/tmp56rtkaog/scrubbed.Rdata']' returned non-zero exit status 1.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2cli/commands.py", line 468, in __call__
    results = action(**arguments)
  File "<decorator-gen-24>", line 2, in SCRuB
  File "/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 274, in bound_callable
    outputs = self._callable_executor_(
  File "/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/qiime2/sdk/action.py", line 509, in _callable_executor_
    output_views = self._callable(**view_args)
  File "/opt/anaconda3/envs/qiime2-2023.5/lib/python3.8/site-packages/q2_SCRuB/_method.py", line 165, in SCRuB
    raise Exception("An error was encountered while running SCRuB"
Exception: An error was encountered while running SCRuB in R (return code 1), please inspect stdout and stderr to learn more.
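
Note: in a log like this one, the line that matters is R's "there is no package called ..." message; the Python traceback only reports that the R script exited with status 1. A minimal sketch for pulling those package names out of a saved debug log, assuming a POSIX shell (the function name `missing_r_packages` is hypothetical and not part of q2-SCRuB):

```shell
# Hypothetical helper (not part of q2-SCRuB): list the R packages that a
# saved debug log reports as missing. $1 is the path to the log file.
missing_r_packages() {
  grep 'there is no package called' "$1" |
    sed 's/.*there is no package called //' |
    tr -cd 'A-Za-z0-9.\n' |   # drop the quote characters around the name
    sort -u
}

# Example (the path is illustrative):
#   missing_r_packages /path/to/qiime2-q2cli-err.log
```

If this prints a package name, the problem is on the R side of the installation rather than in the qiime command itself.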

gaustin15 commented 10 months ago

Hi, thank you for sharing the traceback. Based on the error message, it looks like SCRuB's R component wasn't installed. Could you try running this script, which installs SCRuB and its key dependencies and checks that q2-SCRuB can execute on the demo data?

conda activate qiime2-2023.5
conda install -c conda-forge -c bioconda -c r r-devtools
Rscript -e 'devtools::install_github("Shenhav-and-Korem-labs/SCRuB")'
Rscript -e 'torch::install_torch()'
Rscript -e 'library(SCRuB)' # just to make sure we can run this line
pip install git+https://github.com/Shenhav-and-Korem-labs/q2-SCRuB.git
mkdir SCRuB-example # setting up the data
mkdir SCRuB-example/plasma-data
mkdir SCRuB-example/results
cd SCRuB-example/plasma-data
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/table.qza
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/metadata.tsv
cd ..
qiime SCRuB SCRuB --i-table plasma-data/table.qza --m-metadata-file plasma-data/metadata.tsv --p-control-idx-column is_control --p-sample-type-column sample_type --p-well-location-column well_id --p-control-order "control blank library prep,control blank DNA extraction" --o-scrubbed results/scrubbed.qza

If this script also gives you an error, could you please let me know which line caused the error and share the traceback?
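
One way to see which line of a script like the one above fails is to run each step through a small wrapper that prints the command first and stops at the first failure. This is only a sketch, assuming a POSIX shell; `run_step` is a hypothetical name, not something provided by q2-SCRuB or QIIME 2:

```shell
# Hypothetical helper: print a step, run it, and report it if it fails.
run_step() {
  printf '==> %s\n' "$*"
  if ! "$@"; then
    printf 'FAILED: %s\n' "$*" >&2
    return 1
  fi
}

# Example usage, chaining steps so that execution stops at the first failure:
#   run_step conda install -c conda-forge -c bioconda -c r r-devtools &&
#   run_step Rscript -e 'devtools::install_github("Shenhav-and-Korem-labs/SCRuB")' &&
#   run_step Rscript -e 'library(SCRuB)'
```

The last "==>" line printed before a "FAILED:" message identifies the step to report.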

callaband commented 10 months ago

It seems to be getting stuck at essentially the first step, conda install -c conda-forge -c bioconda -c r r-devtools. Messages:

Retrieving notices: ...working... done
Collecting package metadata (current_repodata.json): done
Solving environment: / 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - qiime2/label/r2023.5/osx-64::q2-vsearch==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-feature-classifier==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-feature-table==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-composition==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-longitudinal==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-diversity==2023.5.1=py38_0
  - qiime2/label/r2023.5/osx-64::provenance-lib==2023.5.1=py38_0
  - conda-forge/noarch::seaborn==0.12.2=hd8ed1ab_0
  - bioconda/noarch::gneiss==0.4.6=py_0
  - qiime2/label/r2023.5/osx-64::q2-quality-control==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-demux==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-sample-classifier==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-gneiss==2023.5.0=py38_0
failed with initial frozen solve. Retrying with flexible solve.
Collecting package metadata (repodata.json): done
Solving environment: / 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - qiime2/label/r2023.5/osx-64::q2-vsearch==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-feature-classifier==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-feature-table==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-composition==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-longitudinal==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-diversity==2023.5.1=py38_0
  - qiime2/label/r2023.5/osx-64::provenance-lib==2023.5.1=py38_0
  - conda-forge/noarch::seaborn==0.12.2=hd8ed1ab_0
  - bioconda/noarch::gneiss==0.4.6=py_0
  - qiime2/label/r2023.5/osx-64::q2-quality-control==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-demux==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-sample-classifier==2023.5.0=py38_0
  - qiime2/label/r2023.5/osx-64::q2-gneiss==2023.5.0=py38_0
 \

It was still working but printed no new messages after 5 minutes. Since it was basically the same message twice, it was possibly stuck in some kind of loop, so I force-cancelled it.

gaustin15 commented 10 months ago

Is this being run in a newly created qiime environment? If not, would it be possible to try creating a fresh qiime environment and running that installation there? This script runs in a new qiime2-2023.5 env on my machine, so I'm guessing that the other pre-existing environment dependencies are causing the problems here.

Another workaround for that line would be to install devtools directly in R within the qiime environment (although I've found that this approach is usually slower in qiime than running the conda install, so I'd recommend exploring the new qiime env approach first).
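
If devtools is installed from within R, it helps to first confirm that the R on PATH is the one inside the activated qiime environment, since a system-wide R keeps its own separate package library. A minimal sketch of that check, assuming a POSIX shell and that `conda activate` has set `CONDA_PREFIX` (the function name `rscript_in_active_env` is hypothetical):

```shell
# Hypothetical helper: succeed only if the Rscript found on PATH lives
# inside the active conda environment ($CONDA_PREFIX).
rscript_in_active_env() {
  [ -n "${CONDA_PREFIX:-}" ] || return 1          # no environment is active
  rscript_path=$(command -v Rscript) || return 1  # no Rscript on PATH
  case "$rscript_path" in
    "$CONDA_PREFIX"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Example usage:
#   conda activate qiime2-2023.5
#   rscript_in_active_env && echo "Rscript belongs to the active environment"
```

A non-zero result suggests that packages installed from R would go into a different library than the one q2-SCRuB uses.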

callaband commented 10 months ago

Okay, in a new qiime2-2023.5 environment, the pip install of SCRuB (pip install git+https://github.com/Shenhav-and-Korem-labs/q2-SCRuB.git) reports success:

Installing collected packages: backports.zoneinfo, tzlocal, rpy2, pyreadr, SCRuB
Successfully installed SCRuB-0+untagged.51.ga1b7926 backports.zoneinfo-0.2.1 pyreadr-0.5.0 rpy2-3.5.14 tzlocal-5.2

but Rscript -e 'devtools::install_github("Shenhav-and-Korem-labs/SCRuB")' results in

Error in loadNamespace(name) : there is no package called ‘devtools’
Calls: :: ... loadNamespace -> withRestarts -> withOneRestart -> doWithOneRestart
Execution halted

So then I ran conda install -c conda-forge -c bioconda -c r r-devtools and got basically the same error as before:

Collecting package metadata (current_repodata.json): | WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.7.1.*, but conda is ignoring the .* and treating it as 1.7.1
done
Solving environment: - 
The environment is inconsistent, please check the package plan carefully
The following packages are causing the inconsistency:

  - defaults/osx-64::_anaconda_depends==2021.11=py38_0
  - defaults/osx-64::anaconda==custom=py38_1
  - conda-forge/noarch::dask==2023.3.2=pyhd8ed1ab_0
  - conda-forge/osx-64::unixodbc==2.3.10=h7b58acd_0
  - conda-forge/osx-64::libgoogle-cloud==2.8.0=h176059f_1
  - conda-forge/osx-64::pyodbc==4.0.39=py38h4cd09af_0
  - conda-forge/noarch::parquet-cpp==1.5.1=2
  - conda-forge/osx-64::libarrow==11.0.0=h547aefa_13_cpu
  - conda-forge/osx-64::arrow-cpp==11.0.0=h694c41f_13_cpu
  - conda-forge/osx-64::pyarrow==11.0.0=py38h5866706_13_cpu
failed with initial frozen solve. Retrying with flexible solve.
Solving environment: failed with repodata from current_repodata.json, will retry with next repodata source.
Collecting package metadata (repodata.json): - WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.6.0.*, but conda is ignoring the .* and treating it as 1.6.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.9.0.*, but conda is ignoring the .* and treating it as 1.9.0
WARNING conda.models.version:get_matcher(546): Using .* with relational operator is superfluous and deprecated and will be removed in a future version of conda. Your spec was 1.8.0.*, but conda is ignoring the .* and treating it as 1.8.0
done
Solving environment: 

It had been trying to solve the environment for 10 minutes without finishing, so I aborted.

I opened RStudio (I am not much of an R coder, but I can install things!) and ran install.packages("devtools"); it did a lot of things before I got:

ERROR: dependencies ‘usethis’, ‘pkgdown’ are not available for package ‘devtools’
* removing ‘/Library/Frameworks/R.framework/Versions/4.0/Resources/library/devtools’
Warning in install.packages :
  installation of package ‘devtools’ had non-zero exit status

Then I tried mamba install -c conda-forge -c bioconda -c r r-devtools (mamba is sometimes faster and better at resolving issues than standard conda) and got this error:

Multi-download failed. Reason: Transfer finalized, status: 404 [https://conda.anaconda.org/conda-forge/ca-certificates-2023.11.17-h8857fd0_0.conda] 961 bytes

# >>>>>>>>>>>>>>>>>>>>>> ERROR REPORT <<<<<<<<<<<<<<<<<<<<<<

    Traceback (most recent call last):
      File "/opt/anaconda3/lib/python3.8/site-packages/conda/exceptions.py", line 1132, in __call__
        return func(*args, **kwargs)
      File "/opt/anaconda3/lib/python3.8/site-packages/mamba/mamba.py", line 945, in exception_converter
        raise e
      File "/opt/anaconda3/lib/python3.8/site-packages/mamba/mamba.py", line 938, in exception_converter
        exit_code = _wrapped_main(*args, **kwargs)
      File "/opt/anaconda3/lib/python3.8/site-packages/mamba/mamba.py", line 884, in _wrapped_main
        result = do_call(parsed_args, p)
      File "/opt/anaconda3/lib/python3.8/site-packages/mamba/mamba.py", line 756, in do_call
        exit_code = install(args, parser, "install")
      File "/opt/anaconda3/lib/python3.8/site-packages/mamba/mamba.py", line 557, in install
        transaction.fetch_extract_packages()
    RuntimeError: Multi-download failed. Reason: Transfer finalized, status: 404 [https://conda.anaconda.org/conda-forge/ca-certificates-2023.11.17-h8857fd0_0.conda] 961 bytes

`$ /opt/anaconda3/condabin/mamba install -c conda-forge -c bioconda -c r r-devtools`

  environment variables:
                 CIO_TEST=<not set>
        CMAKE_PREFIX_PATH=:/opt/anaconda3/envs/SCRuB
        CONDA_BACKUP_HOST=CFAs-MacBook-Pro.local
        CONDA_DEFAULT_ENV=SCRuB
                CONDA_EXE=/opt/anaconda3/bin/conda
             CONDA_PREFIX=/opt/anaconda3/envs/SCRuB
           CONDA_PREFIX_1=/opt/anaconda3
    CONDA_PROMPT_MODIFIER=(SCRuB)
         CONDA_PYTHON_EXE=/opt/anaconda3/bin/python
               CONDA_ROOT=/opt/anaconda3
              CONDA_SHLVL=2
    CONDA_TOOLCHAIN_BUILD=x86_64-apple-darwin13.4.0
     CONDA_TOOLCHAIN_HOST=x86_64-apple-darwin13.4.0
           CURL_CA_BUNDLE=<not set>
     JAVA_LD_LIBRARY_PATH=/opt/anaconda3/envs/SCRuB/lib/jvm/lib/server
               LD_PRELOAD=<not set>
                     PATH=/opt/anaconda3/envs/SCRuB/bin:/opt/anaconda3/condabin:/opt/anaconda3/b
                          in:/opt/anaconda3/condabin:/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbi
                          n:/opt/X11/bin:/Applications/Postgres.app/Contents/Versions/latest/bin
                          :/Applications/Postgres.app/Contents/Versions/latest/bin
         PYTHONNOUSERSITE=/opt/anaconda3/envs/SCRuB/lib/python*/site-packages/
       REQUESTS_CA_BUNDLE=<not set>
            SSL_CERT_FILE=<not set>

     active environment : SCRuB
    active env location : /opt/anaconda3/envs/SCRuB
            shell level : 2
       user config file : /Users/user/.condarc
 populated config files : /Users/callaband/.condarc
          conda version : 23.3.1
    conda-build version : 3.21.4
         python version : 3.8.16.final.0
       virtual packages : __archspec=1=x86_64
                          __osx=10.16=0
                          __unix=0=0
       base environment : /opt/anaconda3  (writable)
      conda av data dir : /opt/anaconda3/etc/conda
  conda av metadata url : None
           channel URLs : https://conda.anaconda.org/conda-forge/osx-64
                          https://conda.anaconda.org/conda-forge/noarch
                          https://conda.anaconda.org/bioconda/osx-64
                          https://conda.anaconda.org/bioconda/noarch
                          https://conda.anaconda.org/r/osx-64
                          https://conda.anaconda.org/r/noarch
                          https://repo.anaconda.com/pkgs/main/osx-64
                          https://repo.anaconda.com/pkgs/main/noarch
                          https://repo.anaconda.com/pkgs/r/osx-64
                          https://repo.anaconda.com/pkgs/r/noarch
          package cache : /opt/anaconda3/pkgs
                          /Users/user/.conda/pkgs
       envs directories : /opt/anaconda3/envs
                          /Users/user/.conda/envs
               platform : osx-64
             user-agent : conda/23.3.1 requests/2.31.0 CPython/3.8.16 Darwin/21.6.0 OSX/10.16
                UID:GID : 502:20
             netrc file : None
           offline mode : False

An unexpected error has occurred. Conda has prepared the above report.

Any ideas? Maybe send me your .yml file?

callaband commented 10 months ago

Okay, conda install -c conda-forge r-devtools works! Followed by

Rscript -e 'devtools::install_github("Shenhav-and-Korem-labs/SCRuB")'
Rscript -e 'torch::install_torch()'
Rscript -e 'library(SCRuB)' # just to make sure we can run this line
pip install git+https://github.com/Shenhav-and-Korem-labs/q2-SCRuB.git
mkdir SCRuB-example # setting up the data
mkdir SCRuB-example/plasma-data
mkdir SCRuB-example/results
cd SCRuB-example/plasma-data
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/table.qza
wget https://github.com/Shenhav-and-Korem-labs/q2-SCRuB/raw/main/ipynb/plasma-data/metadata.tsv
cd ..
qiime SCRuB SCRuB --i-table plasma-data/table.qza --m-metadata-file plasma-data/metadata.tsv --p-control-idx-column is_control --p-sample-type-column sample_type --p-well-location-column well_id --p-control-order "control blank library prep,control blank DNA extraction" --o-scrubbed results/scrubbed.qza

No errors, success!

gaustin15 commented 10 months ago

Awesome; thank you for following up, glad to hear that it ran successfully! Please feel free to reopen if you run into any other problems when running SCRuB on your plate.
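
For anyone who hits the same "there is no package called 'SCRuB'" error, a quick pre-flight check before calling qiime is to ask the Rscript on PATH whether it can find the R packages named in this thread. The sketch below assumes a POSIX shell; `r_packages_loadable` is a hypothetical name, and the check only confirms that each package is installed for that R, not that torch's backend was set up by `torch::install_torch()`:

```shell
# Hypothetical helper: succeed only if every named R package can be found
# by the Rscript on PATH; report the first one that cannot.
r_packages_loadable() {
  for pkg in "$@"; do
    if ! Rscript -e "quit(status = if (requireNamespace('$pkg', quietly = TRUE)) 0 else 1)" \
        >/dev/null 2>&1; then
      printf 'R package not loadable: %s\n' "$pkg" >&2
      return 1
    fi
  done
}

# Example usage inside the activated qiime environment:
#   r_packages_loadable SCRuB torch && echo "R side looks ready"
```

A missing Rscript is reported the same way as a missing package, so a failure on the first package may also mean that R itself is not on PATH.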