PyWiFeS / pipeline

The Python data reduction pipeline for WiFeS
6 stars 26 forks source link

Add support for pip install #24

Closed timothyfrankdavies closed 8 months ago

timothyfrankdavies commented 9 months ago

These changes are a first pass to allow the pipeline to be installed in a python environment using pip install . from the pipeline directory.

In general, the idea is to install the 'src' directory, and to leave any scripts or reference data out.

However, if the PYWIFES_DIR environment variable is not set, the code searches for the reference_data directory based on its install location. If pip installed, the old code won't work.

I've left the code (in wifes_metadata.py) as-is. It works as before if you use pip install -e . (the '-e' flag causes the source directory to be used as the install directory). It can be reconsidered in a separate change.

To run with pip install, 'PYWIFES_DIR' should point to the reference_data folder. e.g. export PYWIFES_DIR=<...>/PyWiFeS/pipeline/reference_data/.

Note I've used setup.py instead of pyproject.toml, which is old convention. We can switch whenever needed.

Changes

  1. Any script outside the src directory that imports a file from the src directory now uses an absolute import. e.g. from pywifes import <filename> or from pywifes.<filename> import <function>
  2. Any file within the src directory that imports another file from the src directory uses a relative import. e.g. from . import <filename>, or from .<filename> import <function>
  3. All imports are now done at the top of each file. I've removed some unused imports, repeated imports, and some conditional imports. Conditional imports are rarely necessary, we can add them again if they're shown to slow performance
  4. setup.py, has been added for pip, which describes the directory structure, dependencies, and metadata.
  5. mpfit_deprecated.py has been deleted. It's called deprecated, and in the src directory. It will live on in git history.
  6. working_updated_wifes_calib.py has been updated to py3. I'm unclear on whether this file is still in use.
  7. lacosmic.py no longer affects numpy and scipy warnings globally, just within its own code. This should make it safe to import.
  8. I removed a sys.path.insert(1, '/data/mash/marusa/reduction_wifes/pipeline/utils/') from generate_metadata_script_stellar.py. Utils may be added to the pip install as another change if needed.

Testing

I've tried setting up a clean virtual python environment, cloning the repo, entering the PyWiFeS/pipeline directory and running pip install .. It installs successfully (including dependencies listed in setup.py), and runs the steps in steps.sh (modified for my environment) ok.

Note there are a few files I'd expect to fail which may not be tested yet. e.g. ascii.py imports files from PyWiFeS/tools: import process_stellar as ps. PyWiFeS/tools may need similar changes to be pip installed.

timothyfrankdavies commented 8 months ago

It turns out that some python imports are famously slow.

matplotlib, pylab, and scipy can each take a few seconds to import on OzStar (partly due to compression).

Since I removed conditional imports, and structured the src directory as simply as possible, we'll import everything the first time pywifes is used. That shouldn't be an issue for testing, but we might want to revisit it towards the end of the project, if it's a problem.

timothyfrankdavies commented 8 months ago

I've retested with the latest version of automation merged in, using 'classic mode' and 'nod and shuffle' data sets.

I'm not completely sure how to verify that my results are the expected results, but the pipeline finishes running with no errors, and it produced a .p11 files in 'reduc_r' and 'reduc_b'. I think that means it succeeded.

@CMartinezLombilla Considering we're still building up our testing, I'm happy to go ahead and merge if you are.

CMartinezLombilla commented 8 months ago

Great! I'm happy to go ahead and merge so I can also test the pip installable version. In case of issues, we can always make new commitments to fix them or go back if required (don't think so). About the imports, I think that's fine for the moment. I'll add it as something to be checked and fixed later on if needed. Thank you @timothyfrankdavies !