edwardbirdlab / BALROG-MON

A Nextflow pipeline for Antimicrobial Resistance exploration in metagenomic samples

plasmer_sorting.py and kraken2_out2blob.py should be put into a container because of dependencies #17

Closed by molikd 5 months ago

molikd commented 5 months ago

plasmer_sorting.py and kraken2_out2blob.py should be put into a container because of their dependencies. We shouldn't assume that the user has the relevant libraries installed for each script.

molikd commented 5 months ago

Containerizing the scripts will prevent the following error:

Command error:
  INFO:    Environment variable SINGULARITYENV_TMPDIR is set, but APPTAINERENV_TMPDIR is preferred
  INFO:    Environment variable SINGULARITYENV_NXF_TASK_WORKDIR is set, but APPTAINERENV_NXF_TASK_WORKDIR is preferred
  WARNING: The directory '/home/david.molik/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
  ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/home/david.molik'

  WARNING: The directory '/home/david.molik/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
  ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/home/david.molik'

  WARNING: The directory '/home/david.molik/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
  ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/home/david.molik'

  WARNING: The directory '/home/david.molik/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
  ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/home/david.molik'

  WARNING: The directory '/home/david.molik/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
  ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/home/david.molik'

  Traceback (most recent call last):
    File "/90daydata/musca/NECE/round_two/BALROG-MON/bin/plasmer_sorting.py", line 11, in <module>
      import pandas as pd
  ModuleNotFoundError: No module named 'pandas'

Work dir:
  /90daydata/musca/NECE/round_two/BALROG-MON/work/11/bbfa511511ca0e6598c09fa723315f

molikd commented 5 months ago

I did eventually find a workaround for this:

Set the Python environment variables in the Nextflow config:

env.PYTHONUSERBASE = '/90daydata/musca/NECE/round_two/BALROG-MON/.local'
env.PYTHONPATH = '/90daydata/musca/NECE/round_two/BALROG-MON/.local'

Install the required dependencies to a local directory:

 pip install --target /90daydata/musca/NECE/round_two/BALROG-MON/.local/lib/ pandas
 pip install --target /90daydata/musca/NECE/round_two/BALROG-MON/.local/lib/ numpy
 pip install --target /90daydata/musca/NECE/round_two/BALROG-MON/.local/lib/ Bio
 pip install --target /90daydata/musca/NECE/round_two/BALROG-MON/.local/lib/ seaborn

Just in case, also change the following in plasmer_sorting.py and kraken2_out2blob.py:

for mod in modules:
    command = 'pip3 install ' + mod
    subprocess.run(command, shell=True)

to

for mod in modules:
    command = 'pip3 install --no-cache-dir --user ' + mod
    subprocess.run(command, shell=True)

Obviously, this is very messy and not ideal.
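
For comparison, here is a minimal sketch of a check-only alternative to the runtime `pip3 install` loop: the script tests whether each module is importable and stops with a readable message if not, instead of trying to write to a possibly read-only home directory. The names `REQUIRED_MODULES`, `find_missing`, and `require_modules` are hypothetical, and the module list is only assumed from the pip commands above; the actual `modules` list in the scripts may differ.

```python
import importlib.util
import sys

# Assumed from the pip commands in this thread; not taken from the scripts themselves.
REQUIRED_MODULES = ["pandas", "numpy", "Bio", "seaborn"]


def find_missing(modules):
    """Return the names in `modules` that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


def require_modules(modules):
    """Exit with a readable message rather than attempting a runtime install."""
    missing = find_missing(modules)
    if missing:
        sys.exit(
            "Missing Python modules: " + ", ".join(missing)
            + ". Install them or run inside a container that provides them."
        )
```

Calling `require_modules(REQUIRED_MODULES)` before the first third-party import would surface a missing dependency as a single clear message, without invoking pip inside the container.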

edwardbirdlab commented 5 months ago

Docker containers have been made for both scripts (plasmer_sorting.py and kraken2_out2blob.py), with all dependencies preinstalled. The scripts have consequently been removed from the bin directory.