Open IFFranciscoME opened 1 week ago
Ok, @priyakasimbeg, here is my proposal of more actionable items to start the first phase of refactoring. It actually concerns the dataset installation/downloading process, which comes before the overall project.
Problems:
Improvement opportunities:
- `dataset_setup.py` config to a `pyproject.toml` logic.
- `~/data/` and `~/temp/data` local folder creation.
- `pyproject.toml` file for all datasets.
- `pyproject.toml`: specify a dependency list for each dataset.
- `README.md` for each dataset with some of the following:
Description
Improve the approach for packaging and use a `pyproject.toml` approach.
Context
A Python project is commonly used in one of two ways: directly, typically by cloning the repository, or in compact form, typically through packaging and/or containerization. In either case, for the same version of the software, the same functionality should be available on any given system, regardless of how it was installed.
Problem
When the user chooses the package/container route to install the software, there will be dependency issues that are not straightforward to solve in some cases (more are yet to be mapped). A non-exhaustive list of these problems is:
- `PYTHONPATH` values may not be updated programmatically.
Elements for the solution (Draft)
In general, it might be a good opportunity to update the packaging from a `setup.py`-oriented approach to a `pyproject.toml` approach.
- PEP 518 – Specifying Minimum Build System Requirements for Python Projects
- Externally Managed Environments
Some other details might be useful: