dieterich-lab / rp-bp

Rp-Bp is a Bayesian approach to predict, at base-pair resolution, ribosome occupancy and translation.
MIT License
7 stars 5 forks source link

setup.py imports rpbp #135

Closed lkeegan closed 1 year ago

lkeegan commented 1 year ago

@eboileau I guess this is what you were referring to about the legacy setup warnings?

Description At the end of installation in setup.py the rpbp module is imported:

https://github.com/dieterich-lab/rp-bp/blob/1ce6d0d64a91613481c195b4e60e4b3a07deb19d/setup.py#L35

When installing as a wheel this gives a ModuleNotFoundError: No module named 'rpbp' error.

It then tries to install the package again using the the legacy 'setup.py install' method where this import does work, but at some point this legacy install method might get removed.

Possible solutions Avoid importing rpbp in setup.py, e.g.

eboileau commented 1 year ago

@lkeegan I don't think this is the problem...

If I comment out references to filenames, I still get for pip install -e . --verbose

  Running setup.py develop for rpbp                                                                                                                                         
    Running command python setup.py develop                                                                                                                                 
    running develop                                                                                                                                                         
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is dep
recated. Use build and pip and other standards-based tools.                                                                                                                 
      warnings.warn(                                                                                                                                                        
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Us
e build and pip and other standards-based tools.                                                                                                                            
      warnings.warn(                                                                                                                                                        
    running egg_info 

and actually the same for pbiotools

     Running setup.py develop for pbiotools                                                                                                                                    
    Running command python setup.py develop                                       
    running develop                                                                                                                                                         
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is dep
recated. Use build and pip and other standards-based tools.                                                                                                                 
      warnings.warn(                                                                                                                                                        
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Us
e build and pip and other standards-based tools.                                                                                                                            
      warnings.warn(                                                                  
    running egg_info    

As for actually using filenames, you a re right, though, this might be a problem, but I don't want to duplicate the function and/or move it back to pbiotools. Do you think something like this would work?

parent = Path(__file__).parent
src = Path(parent, "src")
sys.path.append(src.as_posix())
lkeegan commented 1 year ago

@lkeegan I don't think this is the problem...

If I comment out references to filenames, I still get for pip install -e . --verbose

  Running setup.py develop for rpbp                                                                                                                                         
    Running command python setup.py develop                                                                                                                                 
    running develop                                                                                                                                                         
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is dep
recated. Use build and pip and other standards-based tools.                                                                                                                 
      warnings.warn(                                                                                                                                                        
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Us
e build and pip and other standards-based tools.                                                                                                                            
      warnings.warn(                                                                                                                                                        
    running egg_info 

and actually the same for pbiotools

     Running setup.py develop for pbiotools                                                                                                                                    
    Running command python setup.py develop                                       
    running develop                                                                                                                                                         
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/easy_install.py:144: EasyInstallDeprecationWarning: easy_install command is dep
recated. Use build and pip and other standards-based tools.                                                                                                                 
      warnings.warn(                                                                                                                                                        
    /prj/rpbp-dev/bullseye/envs/rpbp-cmdstan/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Us
e build and pip and other standards-based tools.                                                                                                                            
      warnings.warn(                                                                  
    running egg_info    

This is a different problem, I think only relevant for editable installs, I'm not sure but probably adding a toml file specifying the build system requirements would fix this?

As for actually using filenames, you a re right, though, this might be a problem, but I don't want to duplicate the function and/or move it back to pbiotools. Do you think something like this would work?

parent = Path(__file__).parent
src = Path(parent, "src")
sys.path.append(src.as_posix())

This might work when installing from source, it won't work if installing from a wheel (e.g. from pypi assuming setup is fixed so that wheels can be built). It also looks like something that will need tweaking in the future as pip is updated.

From the python packaging point of view it would be a lot cleaner (and easier to maintain in the future) to remove all of this custom code from setup.py, and instead compile the stan files (if necessary) within rp-bp

Ideally there would be a separate command to do this which could be ran in the bioconda recipe so that they would be pre-compiled for conda users.

eboileau commented 1 year ago

dear @lkeegan


I tried at some point using toml as config format, but this didn't work... I don't really know pros/cons of pyproject.toml vs setup.cfg. But it's not just only due to --editable, because even if I install without it i.e. pip install . --verbose (wheel), I get

/prj/rpbp-dev/bullseye/envs/rpbp-dec/lib/python3.10/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.

There just warnings... but I don't want to run into some longer term issues with the installation due to some deprecated install...


As for the second point, you're right, and I agree, but I'm not sure how this could be done?

Suppose we have a script in console_scripts that would do that. Then in the recipe we need to call this script after install, compile the models somewhere, then copy the directory structure with models upon conda -env create ... --file environment.yml or conda install rpbp, and make sure the paths are conforming to filenames.get_default_models_base()... or am I making this too complicated?

And for pip install?

lkeegan commented 1 year ago

I tried at some point using toml as config format, but this didn't work... There just warnings... but I don't want to run into some longer term issues with the installation due to some deprecated install...

Absolutely - here is a modern pyproject.toml setup for pbiotools which shouldn't have any longer term issues: https://github.com/dieterich-lab/pbiotools/pull/15 Hopefully that doesn't give the same warning if you try it out locally? (Python 3.6 CI fails due to the recent setuptools version requirement, this can be worked around, or since 3.6 is EOL maybe not a problem to stop supporting it for the next release?)

As for the second point, you're right, and I agree, but I'm not sure how this could be done?

Suppose we have a script in console_scripts that would do that. Then in the recipe we need to call this script after install, compile the models somewhere, then copy the directory structure with models upon conda -env create ... --file environment.yml or conda install rpbp, and make sure the paths are conforming to filenames.get_default_models_base()... or am I making this too complicated?

For bioconda, the compiling would happen as part of the bioconda recipe, not on the user's computer. The resulting executables would then be part of the rp-bp installation.

So when a user installs rp-bp it would include these pre-compiled executables, and at runtime rp-bp would check if these files exist and if so call them - otherwise it would compile them as currently done in setup.py We would have to check the details of where these files would get installed by conda and modify filenames.get_default_models_base() to locate them, but that should be possible.

And for pip install?

I would not pre-compile them with pip, and just have rp-bp compile them when it doesn't find them at run-time. (It would be possible to also pre-compile them and bundle them as wheels on pypi but probably not worth the effort to do and maintain this if most users will be using the bioconda version.)

eboileau commented 1 year ago

The install works fine. Do we actually need the empty setup.cfg file? As for Python 3.6, I don't have a problem to remove it now.

lkeegan commented 1 year ago

Great - the empty setup.cfg looks strange but is required for editable installs to work (this will eventually be fixed by https://peps.python.org/pep-0660/ but this will take a while) - and removing python 3.6 support makes sense to me.

eboileau commented 1 year ago

The dev-ssciwr branch is now installable using a toml as config format.

I have moved compilation to a utils scripts, callable as compile-rpbp-models w/o arguments.

Models are instantiated under "models_base", but for the conda install, @lkeegan I'll leave it to you, I'm not sure how to handle the compilation path there... and then how to copy them upon install. The full directory structure, incl. exe, stan, and hpp files, should ideally be copied to "models_base", given by CONDA_PREFIX, see

https://github.com/dieterich-lab/rp-bp/blob/b9431cb2b2112c3302a7ca86fe4600f1dd908bc0/src/rpbp/ribo_utils/filenames.py#L302

eboileau commented 1 year ago

Meanwhile, tests are failing on GA, because the models are not instantiated anymore at setup.

lkeegan commented 1 year ago

@eboileau after trying a few different things out I ended up with #143 as a way to do this, let me know what you think.

The corresponding bioconda recipe (still WIP but pre-compiled models are installed & working on linux, haven't tested macos): https://github.com/lkeegan/bioconda-recipes/commit/1f19d50f04d28b35203ff2298886526e2901b897

eboileau commented 1 year ago

I think this looks great...

I will try to merge this PR over the weekend, then after next Tuesday complete the documentation (might need help for RTD integration), and finalise the apps (in particular #142 )... after that, I think we'll be close to v3.0.0...