Hi @dachafra and @eiglesias34, I made some improvements to the Python package and Docker deployment. Let me know if you would be interested in integrating them.
If you are interested, I'll add more detail to the README.md about how to run it with Docker and/or the pip package (and we could replace the current Dockerfile with this new one to avoid confusion).
Changes made
Added a console_scripts entry point to setup.py
Fixed the __main__.py module
Added a Dockerfile.cli file to build a docker image based on the pip package that can be run as CLI or API
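For illustration only, the entry point change could look roughly like the sketch below. The import path of semantify(), the main() function name, the long option name, and the entry point string are assumptions made for this example, not necessarily what the PR contains.

```python
import argparse

# Hypothetical value for the entry_points argument of setup() in setup.py.
# "rdfizer.__main__:main" assumes main() lives in rdfizer/__main__.py.
ENTRY_POINTS = {"console_scripts": ["rdfizer=rdfizer.__main__:main"]}


def build_parser():
    """Build the command-line parser for the rdfizer command."""
    parser = argparse.ArgumentParser(prog="rdfizer")
    # -c is the flag used in the examples; the long name is an assumption.
    parser.add_argument("-c", "--config_file", required=True,
                        help="path to the configuration file")
    return parser


def main(argv=None, run=None):
    """Parse the arguments and pass the config file path to semantify()."""
    args = build_parser().parse_args(argv)
    if run is None:
        # Assumed import path; adjust to wherever semantify() is defined.
        from rdfizer import semantify
        run = semantify
    run(args.config_file)
    return 0
```

With an entry point like this, pip generates an rdfizer executable that calls main(), while python3 -m rdfizer keeps working as long as __main__.py also calls main().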
Motivations
The official README recommends installing the pip package and running it as a module with python -m.
Separately, the GitHub repo Wiki documents a Docker deployment (which is often necessary to avoid dependency conflicts, since RDFizer requires rdflib 4, which is quite old).
Moreover, the workflow used by the Docker deployment is more complex than it needs to be:
What it needs to do: call the semantify() function with the config file path.
What it currently does: a Dockerfile is built without using the actual rdfizer package (even though that is the recommended way to use it according to the docs), and the Docker image uses an app.py script to start an API. When triggered with a curl call, the API makes a system call to run another Python script, run_rdfizer.py, which finally runs the semantify() function:
@app.route('/graph_creation/<path:config_file>', methods=['GET','POST'])
def rdfgraph(config_file):
    os.system("python3 /app/rdfizer/run_rdfizer.py /" + config_file)
    return "The file has been semantified " + str(config_file) + "\n"
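By contrast, a handler that runs the conversion in-process, without the extra system call, could be sketched as follows. This is a minimal, framework-free sketch: graph_creation and run_semantify are hypothetical names, and the function is assumed to receive the package's semantify() function as its second argument.

```python
def graph_creation(config_file, run_semantify):
    """Run the conversion in-process and return the API's response text."""
    # Mirror the current route, which prefixes the path with "/".
    config_path = "/" + config_file
    # Call the function directly instead of spawning run_rdfizer.py.
    run_semantify(config_path)
    return "The file has been semantified " + str(config_file) + "\n"

# In app.py, the route body would then reduce to something like:
#     return graph_creation(config_file, semantify)
# with semantify imported from the rdfizer package (assumed import path).
```

Passing the function in as an argument keeps the sketch testable without installing rdfizer or Flask.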
Note that the current Dockerfile uses python:3.5, which is no longer supported and also contradicts the requirements of the package in setup.py (so the package cannot be installed in the current Dockerfile):
python_requires='>=3.6',
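To make the mismatch concrete, here is a small check, assuming python_requires is a simple '>=X.Y' lower bound as in the line above. The helper name meets_minimum is hypothetical, and real installers implement the full version specifier rules rather than this simplified comparison.

```python
def meets_minimum(python_version, python_requires=">=3.6"):
    """Return True if a 'major.minor' version satisfies a '>=X.Y' bound."""
    if not python_requires.startswith(">="):
        raise ValueError("this sketch only handles simple '>=X.Y' specifiers")
    # Compare versions as integer tuples so that 3.10 sorts after 3.6.
    minimum = tuple(int(part) for part in python_requires[2:].strip().split("."))
    version = tuple(int(part) for part in python_version.split("."))
    return version >= minimum


print(meets_minimum("3.5"))  # base image used by the current Dockerfile
print(meets_minimum("3.8"))  # base image used by the new Dockerfile
```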
I used python:3.8 in the new Dockerfile and it seems to work fine.
The current Docker workflow adds a lot of complexity without improving the reproducibility of the software, and it also creates two different deployment methods to maintain.
I improved the rdfizer package to make it a CLI, so you don't need to call it with python3 -m every time after installing it locally (this only requires adding an entry point to setup.py).
I added a new Dockerfile.cli image that builds from the Python package and can be run as a CLI command.
Usage
Run (replace $(pwd) with ${PWD} on Windows to use the current working folder):
docker run -it --rm -v $(pwd):/data ghcr.io/vemonet/rdfizer:latest -c config.ini
rdfizer can still be run as python3 -m rdfizer -c config.ini, but it can also be run as rdfizer -c config.ini, or directly with docker run.
You can also start the API from my Docker image:
docker run -it --rm -v $(pwd)/example:/data --entrypoint python ghcr.io/vemonet/rdfizer:latest /app/app.py
Let me know if you are interested in these kinds of changes, and if there is anything you would like to see done differently.