HenrikBengtsson / CBI-software

A Scientific Software Stack for HPC (CentOS oriented)
https://wynton.ucsf.edu/hpc/software/software-repositories.html

Cutadapt make - python2 module #46

Closed hgputnam closed 2 years ago

hgputnam commented 2 years ago

First real issue was cutadapt. It was trying to load a python2 module that I don't know how to get.

Original Makefile:

include config.mk
include ../utils.mk

## Based on the official instructions in Section 'Shared installation (on a cluster)'
## of the 'Installation' document
## https://github.com/marcelm/cutadapt/blob/main/doc/installation.rst
$(INSTALL_TARGET):
    module purge; \
    module load CBI; \
    module load python/2.7.15 2> /dev/null || true; \
    python --version; \
    module list; \
    mkdir -p "$(PREFIX)/bin"; \
    virtualenv "$(PREFIX)/venv"; \
    . "$(PREFIX)/venv/bin/activate"; \
    python -m pip install --upgrade pip; \
    python -m pip install $(NAME)==$(VERSION)
    ln -fs $(PREFIX)/venv/bin/cutadapt $(PREFIX)/bin/
    ls -la "$(PREFIX)"
    ls -la "$(PREFIX)/bin"
    @echo "SOFTWARE INSTALLED TO: $(PREFIX)"

So I changed the above to use the "built-in" python 3 and also added the --user option:

include config.mk
include ../utils.mk

## Based on the official instructions in Section 'Shared installation (on a cluster)'
## of the 'Installation' document
## https://github.com/marcelm/cutadapt/blob/main/doc/installation.rst
$(INSTALL_TARGET):
    module purge; \
    module load CBI; \
    module load python/2.7.15 2> /dev/null || true; \
    python --version; \
    module list; \
    mkdir -p "$(PREFIX)/bin"; \
    python3 -m venv "$(PREFIX)/venv"; \
    . "$(PREFIX)/venv/bin/activate"; \
    python3 -m pip install --upgrade --user pip; \
    python3 -m pip install --user $(NAME)==$(VERSION)
    ln -fs $(PREFIX)/venv/bin/cutadapt $(PREFIX)/bin/
    ls -la "$(PREFIX)"
    ls -la "$(PREFIX)/bin"
    @echo "SOFTWARE INSTALLED TO: $(PREFIX)"

After that, make finished with no errors. Of course, I have no idea what cutadapt does or how to test it. Was this the right thing to do, or did I do something silly?

HenrikBengtsson commented 2 years ago

FYI, this is legacy software that is on the CBI stack only because it was on TIPCC before I started to set up the CBI stack. I'm not sure if anyone is actually using it.

The line

https://github.com/HenrikBengtsson/CBI-software/blob/92f1ab53730426a84baa622cf4a24ebcf8eea967/CBI/cutadapt/Makefile#L10

was probably added during the TIPCC era, when I think we only had Python 2.6. Note how it's made optional by redirecting stderr to /dev/null and using ... || true to make sure it always returns true (so that make won't stop).
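As a small illustration of that idiom outside of make, the sketch below wraps it in a helper. The name try_optional and the failing ls call are made up for the demo; the ls call only stands in for a module load on a host where the module does not exist.

```shell
#!/bin/sh
# Hypothetical helper: run a command that is allowed to fail,
# hiding its error message and always reporting success.
try_optional() {
    "$@" 2> /dev/null || true
}

# Stand-in for 'module load python/2.7.15' on a host without that module:
# the command fails, but nothing is printed and the status is 0.
try_optional ls /nonexistent-path-for-demo
echo "exit status: $?"
```

Without the || true part, the failing command's non-zero status would be passed on to the caller.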

It was designed for Python 2 for sure, so I don't know how well it runs on Python 3. Are you sure you had to switch to python3?

PS. I'm assuming you're doing this for testing purposes and not for production somewhere else.

HenrikBengtsson commented 2 years ago

FYI 2, when using a virtual environment, you don't want to use --user, cf. https://wynton.ucsf.edu/hpc/howto/python.html
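A minimal sketch of what that looks like in practice, assuming python3 with the standard venv module is available. The temporary path is hypothetical, and --without-pip is used here only so the demo runs offline; the Makefile above creates the environment with pip included.

```shell
#!/bin/sh
# Create and activate a throwaway virtual environment (hypothetical path).
venv_dir="$(mktemp -d)/venv"
python3 -m venv --without-pip "$venv_dir"
. "$venv_dir/bin/activate"

# Inside a virtual environment, sys.prefix points at the environment
# and therefore differs from sys.base_prefix.
in_venv=$(python3 -c 'import sys; print(sys.prefix != sys.base_prefix)')
echo "inside venv: $in_venv"

# With the environment active, installs go into it without --user, e.g.:
#   python3 -m pip install --upgrade pip
#   python3 -m pip install "$NAME==$VERSION"
deactivate
```

With --user, pip targets the per-user site directory under the home directory instead of the environment, which is not what a shared installation under $(PREFIX) wants.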

hgputnam commented 2 years ago

It was designed for Python 2 for sure, so I don't know how well it runs on Python 3. Are you sure you had to switch to python3?

I am not sure of that. I will retry with Python 2.

PS. I'm assuming you're doing this for testing purposes and not for production somewhere else.

I am building it on marlowe. I was planning to release for use the ones that build without problems. I am telling Dave that we can't provide support for this, though. I was hoping to learn enough that I could show him or a staff member how to maintain it. Honestly, if we just give him the R modules he would be happy. So this is half for my own education, and hopefully something useful comes of it.

hgputnam commented 2 years ago

Forgot to mention, the other thing I am trying to test out here is the idea of installing this repo locally on each compute node. It seems like the make stuff for individual modules is a pretty manual process. I was sort of doing the makes in ASCII order, but I found a couple of modules that were needed and not there yet. So I skipped ahead, did the make for those, and then went back and made the dependent module.

Anyway, the idea I want to test is installing these on one node and then using an ansible playbook to just copy the software home /software/cbi to all the other nodes.

HenrikBengtsson commented 2 years ago

Honestly, if we just give him the R modules he would be happy

I would keep it simple and start out with that. Maintaining modules is still quite a bit of work. I try to make it easier each time I touch this. But, as you discovered, we still have no easy way to run a master make install-everything-from-scratch or make update-everything-to-the-most-recent-version; one has to go through and call make for everything that should be installed or updated, and in the correct order.

I'd also like to add a unit-test framework, so that one can run make validate on each installed software tool to make sure things are actually working.
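As a rough idea of what such a check could do, here is a minimal sketch. The function name validate_tool and its output format are hypothetical and not part of the CBI stack; it assumes the tool answers a --version query, which cutadapt does.

```shell
#!/bin/sh
# Hypothetical smoke test: succeed only if the tool is on PATH
# and its version query exits with status 0.
validate_tool() {
    tool="$1"
    command -v "$tool" > /dev/null 2>&1 || { echo "MISSING: $tool"; return 1; }
    "$tool" --version > /dev/null 2>&1 || { echo "BROKEN: $tool"; return 1; }
    echo "OK: $tool"
}

# Demo with a name that should not exist on any system.
validate_tool no-such-tool-demo || echo "validation failed"
```

A make validate target could call something like validate_tool cutadapt after loading the module, and fail the target on a non-zero status.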

Another thing I'd like is to have a make check-for-updates. Right now, I have to manually go through each module and go to the different webpages to check for updates.

So, I prefer to move slowly but steadily on my end (and maintain this for C4 and Wynton, which I try to keep in sync).

hgputnam commented 2 years ago

It is a fantastic resource. Even a non-developer like me is able to figure out most things. Something I learned from you is a cluster is only as good as the software on it.

hgputnam commented 2 years ago

Can't get it to go with python2.7. The main difficulty is the pip upgrade won't work, even directly from the CLI:

$ pip install --upgrade pip
Collecting pip
  Using cached https://files.pythonhosted.org/packages/33/c9/e2164122d365d8f823213a53970fa3005eb16218edcfc56ca24cb6deba2b/pip-22.0.4.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-RwPXXs/pip/setup.py", line 7
        def read(rel_path: str) -> str:
                         ^
    SyntaxError: invalid syntax

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-build-RwPXXs/pip/
You are using pip version 8.1.2, however version 22.0.4 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.

I am just leaving this module out...
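For context on the traceback above: the setup.py of pip 22.0.4 uses type annotations, which is Python 3-only syntax, and pip dropped Python 2.7 support in version 21.0. Under Python 2.7, an upgrade would therefore have to be capped below 21. The sketch below only picks the requirement string; the function name pip_requirement_for is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: choose a pip requirement for a given Python major version.
# pip 21.0 and later no longer support Python 2.7.
pip_requirement_for() {
    case "$1" in
        2) echo 'pip<21' ;;
        *) echo 'pip' ;;
    esac
}

echo "Python 2: $(pip_requirement_for 2)"
echo "Python 3: $(pip_requirement_for 3)"

# Possible use inside an activated environment (not run here):
#   py_major=$(python -c 'import sys; print(sys.version_info[0])')
#   python -m pip install --upgrade "$(pip_requirement_for "$py_major")"
```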

HenrikBengtsson commented 2 years ago

pip install --upgrade pip

  1. Don't call pip directly, see https://wynton.ucsf.edu/hpc/howto/python.html#upgrading-pip
  2. make install does not require you to upgrade pip outside - it takes care of it and does it inside that virtual environment. It won't touch your personal, user stack.

FYI, make install works fine on my Ubuntu 18.04 with Python 2.7.17. Having said that, I'm surprised, because the docs say that it requires Python (>= 3.6), cf. https://cutadapt.readthedocs.io/en/stable/installation.html#dependencies

HenrikBengtsson commented 2 years ago

FYI, make install works fine on my Ubuntu 18.04 with Python 2.7.17. Having said that, I'm surprised, because the docs say that it requires Python (>= 3.6), cf. https://cutadapt.readthedocs.io/en/stable/installation.html#dependencies

Never mind, it might actually be something with my personal virtualenv; it appears to set up a virtual environment with Python 3.6:

$ command -v virtualenv
/home/hb/.local/bin/virtualenv

$ virtualenv /tmp/venv/
created virtual environment CPython3.6.9.final.0-64 in 272ms
  creator CPython3Posix(dest=/tmp/venv, clear=False, global=False)
  seeder FromAppData(download=False, pip=bundle, setuptools=bundle, wheel=bundle, via=copy, app_data_dir=/home/hb/.local/share/virtualenv)
    added seed packages: pip==20.2.2, setuptools==49.6.0, wheel==0.35.1
  activators BashActivator,CShellActivator,FishActivator,PowerShellActivator,PythonActivator,XonshActivator

which gives me:

$ cutadapt
This is cutadapt 3.4 with Python 3.6.9
Command line parameters: 
Run "cutadapt --help" to see command-line options.
See https://cutadapt.readthedocs.io/ for full documentation.

cutadapt: error: You did not provide any input file names. Please give me something to do!

hgputnam commented 2 years ago

Verified with the 'built-in' CentOS 7 python3. Makefile:

include config.mk
include ../utils.mk

## Based on the official instructions in Section 'Shared installation (on a cluster)'
## of the 'Installation' document
## https://github.com/marcelm/cutadapt/blob/main/doc/installation.rst
$(INSTALL_TARGET):
    module purge; \
    module load CBI; \
    module load python/2.7.15 2> /dev/null || true; \
    python --version; \
    module list; \
    mkdir -p "$(PREFIX)/bin"; \
    python3 -m venv "$(PREFIX)/venv"; \
    . "$(PREFIX)/venv/bin/activate"; \
    python3 -m pip install --upgrade pip; \
    python3 -m pip install $(NAME)==$(VERSION)
    ln -fs $(PREFIX)/venv/bin/cutadapt $(PREFIX)/bin/
    ls -la "$(PREFIX)"
    ls -la "$(PREFIX)/bin"
    @echo "SOFTWARE INSTALLED TO: $(PREFIX)"

I now get similar output:

[ansible@node14 cutadapt]$ cutadapt
This is cutadapt 3.7 with Python 3.6.8
Command line parameters: 
Run "cutadapt --help" to see command-line options.
See https://cutadapt.readthedocs.io/ for full documentation.

cutadapt: error: You did not provide any input file names. Please give me something to do!

HenrikBengtsson commented 2 years ago

I think I've fixed this in commit c99b901; updated

virtualenv "$(PREFIX)/venv"; \

to

virtualenv -p python3 "$(PREFIX)/venv";