paris-saclay-cds / ramp-workflow

Toolkit for building predictive workflows on top of pydata (pandas, scikit-learn, pytorch, keras, etc.).
https://paris-saclay-cds.github.io/ramp-docs/
BSD 3-Clause "New" or "Revised" License
68 stars 43 forks source link

ENH add the external_imports directory to sys.path #279

Closed martin1tab closed 3 years ago

martin1tab commented 3 years ago

Closes #181

With this PR merged, users would be able to import packages copied to external_imports directory. This PR adds the external_imports directory to sys.path.

Let's imagine this kit structure:

problem.py data/ external_imports/ ......utils/ ............test_imports.py

Then to import test_imports.py, it is sufficient to do (from inside problem.py or estimator/classifier.py): from utils import test_imports

albertcthomas commented 3 years ago

Please stay in the original PR #277 and change the code in the original PR.

martin1tab commented 3 years ago

Please stay in the original PR #277 and change the code in the original PR.

You are right, sorry I have lost my fork, so could not open my original PR

albertcthomas commented 3 years ago

After discussing with @kegl and @avirmaux in real life, if we put the ramp_kit_dir in sys.path then we don't control what users will be able to import, they'll have access to everything in the ramp_kit_dir which could be bad (correct me if I'm wrong). Thus we keep the idea of the folder where you are able to do imports, this folder could be named external_imports and would contain utils/ or any other folder and by adding external_imports to sys.path you would be able to do from utils import foo. It's a bit more obscure for a python user (maybe) but the name of the folder and documentation would help. WDYT @rth? @martin1tab?

rth commented 3 years ago

Thus we keep the idea of the folder where you are able to do imports, this folder could be named external_imports and would contain utils/ or any other folder and by adding external_imports to sys.path you would be able to do from utils import foo

This could work indeed as long as it's documented.

Though in the end wouldn't be simpler to document that people can set PYTHONPATH to add arbitrary folders to the Python import path?

PYTHONPATH="/some_extra_packages/"  ramp-test

(or when starting the notebook)

That would be a fairly standard solution with fewer constraints about hard-coded folder paths. Unless it wouldn't work for some reason?

kegl commented 3 years ago

I'd prefer adding a standard external_imports so when we deploy the RAMP kit at the backend, we don't need to do any extra steps, ramp-test should work out of the box.

codecov[bot] commented 3 years ago

Codecov Report

Merging #279 (50b2da2) into master (593b9fb) will increase coverage by 0.06%. The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #279      +/-   ##
==========================================
+ Coverage   82.98%   83.05%   +0.06%     
==========================================
  Files         137      137              
  Lines        4684     4703      +19     
==========================================
+ Hits         3887     3906      +19     
  Misses        797      797              
Impacted Files Coverage Δ
rampwf/tests/test_kits.py 97.01% <100.00%> (+0.86%) :arrow_up:
rampwf/utils/testing.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 593b9fb...50b2da2. Read the comment docs.

rth commented 3 years ago

Could we do without adding rampwf/tests/kits/titanic_test_external_imports folder with all the data files? For testing it you could for instance create a tempdir, copy one of the existing kits from rampwf/tests/kits there and add the file you need. Wouldn't that work?

kegl commented 3 years ago

@albertcthomas If we also add the --import option to ramp-test, one can use it to add . in a local repo not intended for a data challenge, and have utils in the root kit folder, do from utils import .... Then in case we want to turn it into a challenge, we can simply copy utils into external_imports sorramp-test with no option works.

martin1tab commented 3 years ago

Could we do without adding rampwf/tests/kits/titanic_test_external_imports folder with all the data files? For testing it you could for instance create a tempdir, copy one of the existing kits from rampwf/tests/kits there and add the file you need. Wouldn't that work?

OK !!

albertcthomas commented 3 years ago

@albertcthomas If we also add the --import option to ramp-test, one can use it to add . in a local repo not intended for a data challenge, and have utils in the root kit folder, do from utils import ....

yes that would work but for now instead of adding new options and changing sys.path users can use python -m.

I am still hesitating between doing from external_imports.utils import x and from utils import x.

I was just thinking that the former was more explicit and that it would be easier for a new user to know where the import is coming from. However if someone was using python -m (outside a challenge use) and had modules in the root kit dir, copying them to external_imports/ would also require to change the imports to add external_imports.

Maybe users should not use python -m and put all their external imports in external_imports :)

albertcthomas commented 3 years ago

I did a bit of cleaning. If tests pass this is good for me, thanks @martin1tab

albertcthomas commented 3 years ago

@kegl @rth please merge if this is good for you as well

rth commented 3 years ago

Could you please add a what's new entry https://github.com/paris-saclay-cds/ramp-workflow/blob/master/doc/whats_new.rst#050-unreleased?

martin1tab commented 3 years ago

Could you please add a what's new entry https://github.com/paris-saclay-cds/ramp-workflow/blob/master/doc/whats_new.rst#050-unreleased?

Yes, could you recheck please ? @rth