Main changes
Rename the code root to pop2vec, as in "population data to vec"; the name generalizes life2vec because we also use network data.
Make all imports relative to the root. For this, I added import tests in the main modules: llm, graph, evaluation, fake_data
move others/synthetic_data_generation to pop2vec/fake_data
Updated the calls to Python scripts in the affected slurm scripts.
Tests can be run from project root with
python -m pytest
and the test directories are specified in pyproject.toml
Notes
the import tests fail on several files that don't wrap the code with
if __name__ == "__main__"
The tests for these files are commented out and marked in the respective test scripts.
There is a fair amount of code that is currently unused, for instance pop2vec.llm.src.strategy. We should clean this up, see #79.
In the slurm scripts, I have not yet updated the cd commands into the relevant directories that precede the Python script calls. We can do this in the context of #39 and #27.
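To illustrate the first note, here is a minimal sketch of what an import test checks and why the `if __name__ == "__main__"` guard matters. The module name `guarded_example`, its contents, and the helper `import_from_source` are hypothetical and only stand in for a real script in the repository; the actual import tests may be written differently.

```python
import importlib
import sys
import tempfile
import textwrap
from pathlib import Path

# Hypothetical script body. The guard keeps main() from running on import.
GUARDED_SCRIPT = textwrap.dedent(
    '''
    calls = []

    def main():
        calls.append("ran")

    if __name__ == "__main__":
        main()
    '''
)


def import_from_source(module_name: str, source: str):
    """Write `source` to a fresh temporary directory and import it by name."""
    tmp_dir = tempfile.mkdtemp()
    Path(tmp_dir, f"{module_name}.py").write_text(source)
    sys.path.insert(0, tmp_dir)
    try:
        return importlib.import_module(module_name)
    finally:
        sys.path.remove(tmp_dir)


def test_guarded_script_imports_cleanly():
    module = import_from_source("guarded_example", GUARDED_SCRIPT)
    # Importing must not execute main(); only running the file as a script does.
    assert module.calls == []
    assert callable(module.main)
```

A script without the guard would run its top-level work (argument parsing, data loading) during the import itself, which is the likely reason the import tests fail on those files.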
To do
I found the following missing packages in our venv:
dask.dataframe
seaborn
dask.distributed
hydra-core
sentence_transformers
[x] Find out which modules on Snellius provide them, create the respective txt files, and try them out
update pyproject with name
[x] update developer docs with instructions for writing tests
Implications
For future code
tests can be added in the respective test folders
all imports from the repository need to be relative to the root. -> this makes it possible to re-use code across modules
Everything in slurm scripts needs to be relative to the project root, i.e., calls to Python scripts as well as any file-path arguments that point inside the project.
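As a rough sketch of the import rule above, the snippet below builds a throwaway package tree and shows one subpackage reusing code from another through an import anchored at the root package. All names in it (`pop2vec_demo`, `helpers`, `consumer`, `make_ids`, `count_nodes`) are hypothetical and chosen so they cannot clash with the real pop2vec package; adding the temporary directory to `sys.path` stands in for running from the project root.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical package layout, loosely mirroring <root>/fake_data and <root>/graph.
FILES = {
    "pop2vec_demo/__init__.py": "",
    "pop2vec_demo/fake_data/__init__.py": "",
    "pop2vec_demo/fake_data/helpers.py": (
        "def make_ids(n):\n"
        "    return list(range(n))\n"
    ),
    "pop2vec_demo/graph/__init__.py": "",
    # The import below starts at the root package, not at the current folder.
    "pop2vec_demo/graph/consumer.py": (
        "from pop2vec_demo.fake_data.helpers import make_ids\n"
        "\n"
        "def count_nodes(n):\n"
        "    return len(make_ids(n))\n"
    ),
}


def build_demo_root() -> Path:
    """Create the demo package tree in a temporary directory and return its root."""
    root = Path(tempfile.mkdtemp())
    for rel_path, source in FILES.items():
        target = root / rel_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(source)
    return root


project_root = build_demo_root()
sys.path.insert(0, str(project_root))  # stands in for starting Python at the project root
consumer = importlib.import_module("pop2vec_demo.graph.consumer")
print(consumer.count_nodes(3))
```

Because the import in `consumer.py` names the full path from the root package, it resolves the same way no matter which subpackage the calling code lives in, which is what allows code to be shared across modules.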
Problems created when merging this PR
Merging this PR will probably create merge conflicts in other branches, in particular @benczaja's work on #74. After merging, I will coordinate with Ben to rebase that branch onto main.
This is a fix for #30