Closed dominiquesydow closed 4 years ago
Hi @jaimergp,
Could you please review the object-oriented API I am suggesting here?
A few months ago, we talked about `requests`-like sessions - this is how I would implement the idea.
Please wait with the review - I will move the code from the notebook to an actual module.
@jaimergp, now the API is ready for review :)
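For readers skimming the thread, here is a minimal sketch of the `requests`-like session idea. The names `KinasesProvider`, `Session`, `setup_remote()` and `setup_local()` come from the PR description; the method name `by_kinase_name`, the constructor arguments, the `fetch` callable and the row layout are assumptions made up for illustration, not the module's actual signatures.

```python
class KinasesProvider:
    """Interface shared by the remote and local implementations."""

    def by_kinase_name(self, name):
        """Return all kinase records with the given name."""
        # Left empty on purpose: remote and local code differ too much to share.
        raise NotImplementedError("Implemented in the remote and local modules.")


class RemoteKinases(KinasesProvider):
    def __init__(self, fetch):
        # `fetch` stands in for a web client call (hypothetical).
        self._fetch = fetch

    def by_kinase_name(self, name):
        return self._fetch(name)


class LocalKinases(KinasesProvider):
    def __init__(self, rows):
        # `rows` stands in for the metadata table built from local files (hypothetical).
        self._rows = rows

    def by_kinase_name(self, name):
        return [row for row in self._rows if row["kinase.name"] == name]


class Session:
    """Bundles the providers, so calling code does not depend on the backend."""

    def __init__(self, kinases):
        self.kinases = kinases


def setup_remote(fetch):
    return Session(RemoteKinases(fetch))


def setup_local(rows):
    return Session(LocalKinases(rows))


rows = [{"kinase.name": "EGFR"}, {"kinase.name": "BRAF"}]
local = setup_local(rows)
remote = setup_remote(lambda name: [{"kinase.name": name}])

# Same call, regardless of where the data comes from.
print(local.kinases.by_kinase_name("EGFR"))
print(remote.kinases.by_kinase_name("EGFR"))
```

Keeping the docstring only on the parent method is enough for most tooling, since `inspect.getdoc` falls back to the parent class when a child method has no docstring of its own.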
Good morning @jaimergp,
I have a quick question on best practices.
In `opencadd.databases.klifs`, we need to streamline column names for all the different remote and local tables (it is a wild mix of names).
I defined the renaming here at the top of the `utils` module:
https://github.com/volkamerlab/opencadd/blob/databases_klifs_api/opencadd/databases/klifs_new/utils.py
Is it OK to define the mapping (two large dictionaries) in the `utils` module, or should this live in e.g. a JSON file (if so, where would that JSON file go)?
The mapping can (and I'd even say should) live in a Python file, so your first idea is spot on :) Since this is a static definition, it might even be worth its own module, like `opencadd.databases.klifs.definitions` or `.schema`.
Nice! I like `schema` a lot! Thank you!
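A rough sketch of what such a `schema` module could look like: the static mappings live at module level, and a small helper applies them. The dictionary names, the column names and the `rename_columns` helper are made up for illustration and are not taken from the actual module.

```python
# Hypothetical schema.py: static column-name definitions, kept apart from helper code.

# Remote column name -> unified column name (illustrative entries only).
REMOTE_COLUMNS = {
    "kinase_ID": "kinase.id",
    "name": "kinase.name",
}

# Local column name -> unified column name (illustrative entries only).
LOCAL_COLUMNS = {
    "NAME": "kinase.name",
    "PDB": "structure.pdb",
}


def rename_columns(record, mapping):
    """Return a copy of `record` with keys renamed; unknown keys are kept as-is."""
    return {mapping.get(key, key): value for key, value in record.items()}


remote_row = {"kinase_ID": 1, "name": "EGFR", "species": "Human"}
print(rename_columns(remote_row, REMOTE_COLUMNS))
```

With pandas tables, the same dictionaries could be passed to `DataFrame.rename(columns=...)`.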
Hi @jaimergp,
This PR is almost ready to be merged - with one question open (see PR description).
The tutorial shows more or less all the functionalities of this module: https://github.com/volkamerlab/opencadd/blob/databases_klifs_api/docs/tutorials/databases_klifs.ipynb
Should we do a full code review here before merging to `master`? How do you propose to do this since the PR is pretty big?
I cannot use `__file__` directly in my unit test files (currently using `__name__` instead). `__file__` does not return absolute paths to my data, what am I missing? Maybe missing `__init__.py` files?
Can you link to specific instances where this is happening? Sorry, I am catching up on all notifications and there's a looot to cover :)
@jaimergp, I figured it out - I think :)
`pytest` is always run from the package top level directory, here `opencadd/`.
That means whenever I use `__name__` in file paths (to load test data), I need to set the path from that package top level directory to the directory with my test files.
This will work:
PATH_TEST_DATA = (
    Path(__name__).parent / "opencadd" / "tests" / "databases" / "data" / "KLIFS_download"
)
This will not (since I am omitting a few subdirectories between the package top level directory and the test data directory):
PATH_TEST_DATA = (
    Path(__name__).parent / "data" / "KLIFS_download"
)
However, this will work, since `__file__` really gets the path from the unit test file, not the package top level directory:
PATH_TEST_DATA = (
    Path(__file__).parent / "data" / "KLIFS_download"
)
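A note on why the variants behave differently: `__name__` holds the dotted module name, not a file path, so `Path(__name__).parent` is just `.` and the rest is resolved against the current working directory. A small sketch of this, where the helper names are made up and the module name and file path are passed in as plain strings so that the snippet runs anywhere:

```python
from pathlib import Path


def data_dir_from_name(module_name):
    # A dotted module name contains no path separators, so .parent is Path(".").
    return Path(module_name).parent / "data" / "KLIFS_download"


def data_dir_from_file(module_file):
    # Anchored at the folder of the file itself, independent of the working directory.
    return Path(module_file).resolve().parent / "data" / "KLIFS_download"


from_name = data_dir_from_name("opencadd.tests.databases.test_klifs_submodule")
from_file = data_dir_from_file("opencadd/tests/databases/test_klifs_submodule.py")

# Relative path: only valid if the working directory happens to contain data/.
print(from_name, from_name.is_absolute())
# Absolute path next to the test module.
print(from_file, from_file.is_absolute())
```

In a test module, the second helper would be called with `__file__`.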
Let me know if you want a different `opencadd/opencadd/tests` directory structure.
At the moment:
tests/
    databases/
        data/
            KLIFS_download/
        test_klifs_submodule.py
    structures/
        test_superposition_submodule.py
We could also move `data/` to `tests/` like this?
tests/
    data/
        klifs/  # Use here module names?
    databases/
        test_klifs_submodule.py
    structures/
        test_superposition_submodule.py
I prefer a single, top-level `tests.data` directory! Easier to manage. Inside that directory we can stick to the package hierarchy or just a flat list if the amount of files is not huge. I usually default to start simple if possible (flat list here).
> I prefer a single, top-level `tests.data` directory! Easier to manage. Inside that directory we can stick to the package hierarchy or just a flat list if the amount of files is not huge. I usually default to start simple if possible (flat list here).
Perfect!
We now have the following:
tests/
    data/
        klifs/  # Contains the KLIFS files in the folder structure as set in the KLIFS download
            HUMAN/
            MOUSE/
            KLIFS_export.csv
            KLIFS_metadata.csv
            overview.csv
    databases/
        test_klifs_submodule.py
    structures/
        test_superposition_submodule.py
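With this layout, a test module one level below `tests/` can locate the shared data folder relative to its own file. A minimal sketch; the helper name `klifs_test_data` is made up, and the example path is passed as a plain string only so that the snippet runs outside the repository:

```python
from pathlib import Path


def klifs_test_data(test_file):
    # parents[0] is the module's folder (e.g. tests/databases), parents[1] is tests/.
    tests_dir = Path(test_file).resolve().parents[1]
    return tests_dir / "data" / "klifs"


path = klifs_test_data("opencadd/tests/databases/test_klifs_submodule.py")
print(path / "KLIFS_export.csv")
```

In a real test module this would read `PATH_TEST_DATA = klifs_test_data(__file__)`.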
Merge this branch into the `add_io_klifs_subpockets` branch, see PR #44.
Description
Refactor the `klifs` module with an object-oriented API for remote and local requests, using remote and local sessions similar to `requests` sessions, as proposed by @jaimergp here.
New module structure:
Todos
- `api` module
  - `Session` class - and quick-access functions `setup_remote()` and `setup_local()`
- `schema` module
- `core` module
  - `KinasesProvider`, `LigandsProvider`, `StructuresProvider`, `BioactivitiesProvider`, `InteractionsProvider`, `PocketsProvider`, `CoordinatesProvider` containing (empty) class methods that will be used in both `remote` and `local` modules
  - `remote` and `local` modules (hence do not define docstrings again in child classes).
  - `BaseProvider` parent class that defines class methods that will be used in all `...Provider` classes mentioned before
- `remote` and `local` modules
  - `Kinases`, `Ligands`, `Structures`, `Bioactivities`, `Interactions`, `Pockets`, `Coordinates`; fill content to class methods
  - `SessionInitializer` class to `local` module that creates metadata database from local metadata KLIFS files.
- `utils` module
Questions
- Is the `opencadd.databases.klifs.core` module sensible? I wanted to define the class methods that are to be expected in both `remote` and `local` modules - however, I cannot add actual code to the superclass methods, since the code will differ quite a bit between remote and local requests.
  > Ok.
- I cannot use `__file__` directly in my unit test files (currently using `__name__` instead). `__file__` does not return absolute paths to my data, what am I missing? Maybe missing `__init__.py` files?
Status