INCATools / ontology-development-kit

Bootstrap an OBO Library ontology
http://incatools.github.io/ontology-development-kit/
BSD 3-Clause "New" or "Revised" License

Schnapsidee: create a python utility odk-toolbox #908

Open matentzn opened 1 year ago

matentzn commented 1 year ago

The idea: a Python utility that wraps the ODK using the docker package (https://pypi.org/project/docker/).

This would let us avoid having to share ODK runner scripts around, and might even enable platform-independent runner scripts?

Schnapsidee ("An impractical idea which seems brilliant when one is drunk.")?

gouttegd commented 1 year ago

Sounds like the same kind of idea I had at the bottom of ticket #877:

A third component that could be maintained separately is the src/ontology/run.sh script, which is becoming more and more complicated (and whose features are partially duplicated in seed-via-docker.sh; also, the Windows variant is systematically left behind). I envision some kind of odk-runner tool that would transparently run commands using whatever “toolbox backend” is available on the system (e.g. using natively installed tools when present, or Docker, or Singularity).

(… except that I was not drunk when I wrote that ticket, so I can’t use that as an excuse.)
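The backend selection described in that quote could be sketched roughly as follows. This is only an illustration using Python's standard library: the preference order, the backend names, and the use of the robot executable as the sign that native tools are installed are all assumptions, not something settled in this thread.

```python
import shutil

# Hypothetical preference order: (backend name, executable to probe for).
# Probing for "robot" as evidence of natively installed tools is an assumption.
BACKENDS = (
    ("native", "robot"),
    ("docker", "docker"),
    ("singularity", "singularity"),
)


def detect_backend(which=shutil.which):
    """Return the name of the first backend whose executable is in the PATH.

    The lookup function is a parameter so the logic can be exercised
    without any of the tools being installed. Returns None if no backend
    is available.
    """
    for name, executable in BACKENDS:
        if which(executable) is not None:
            return name
    return None
```

A launcher built on this would call detect_backend() once at startup and report a clear error when it returns None.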

matentzn commented 1 year ago

YES! I didn't connect these dots!

gouttegd commented 1 year ago

That odk-runner or odk-toolbox would be intended to replace all the various ODK launch scripts out there: both those provided and used by the ODK itself (the seed-via-docker.sh script used to initialise a brand new repo, the run.sh script generated by the seeding process, the run.bat script used on Windows) and the custom launch scripts that people have undoubtedly created in the wild.

This would notably ensure that the Windows variant is no longer continuously left behind in terms of features.

With such a “universal ODK launcher” available, the seeding process of the ODK could still generate a src/ontology/run.{sh,bat} script, but that script would be a one-liner that just calls the real ODK launcher.

Now the first big problem of such a project would be: in which language should it be done? I argue that it MUST NOT be in Python, for two reasons:

1) We cannot count on Python being available on Windows; if Windows users must install ActivePython or another Python distribution in order to be able to use the ODK, this is a huge barrier.

2) Installing Python scripts and their dependencies is always a mess. We are well-placed to know that, since we are cramming more and more Python tools into the ODK (e.g. sssom-py or OAK) precisely so that people don’t have to go through the hassle of installing them. If people must now install a Python script and its dependencies (such as this docker package) in order to launch the ODK in order to run more Python scripts… Well, that is self-defeating.

Ideally we would want a scripting language for which an interpreter is available natively on (at least) GNU/Linux, macOS, and Windows. Problem is, I am not sure such a language even exists…

I am more and more thinking that we might have to choose a compiled launcher (maybe written in one of those fancy modern languages such as Rust or Go). Sure, we would have to provide different compiled binaries for all the systems we want to support, but at least the binaries would run natively without requiring the installation of an interpreter.

gouttegd commented 1 year ago

No matter the language we use in the end, I don’t think we can realistically create such a “universal launcher” until/unless we have a committed ODK developer on Windows — I can write the launcher for all platforms and test it on GNU/Linux and macOS but if it is intended to be really universal it will have to be tested on Windows as well, and that I cannot do.

In fact the main reason the run.bat script is systematically left behind is not so much that I don’t know how to update it (Windows’ batch script is not that hard), but rather that I don’t dare update it since I cannot test whether my changes work as expected.

matentzn commented 1 year ago

OK, I will talk to @ehartley and her team, and maybe we can also recruit @StroemPhi for this; but for now, I am still not convinced by the arguments against Python. The nice thing is that there is plenty of material out there on how to install it, so we don't have to feel responsible for people getting set up with Python.

Let's discuss at the next tech call!

gouttegd commented 1 year ago

we don't have to feel responsible for people getting set up with Python.

Up to you. I don’t care, I use a decent operating system. Happy to let you deal with the fallout of users who won’t know how to pass the first step or who would end up with a broken pip registry.

gouttegd commented 1 year ago

At the very least, if we end up writing the launcher in Python, we should try hard to depend only on Python’s standard library. In particular, I don’t see the point of depending on that docker package when all the launcher needs to do is spawn a docker run command. The standard subprocess module should be more than enough for what we need.
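To make that concrete, here is a minimal sketch of a standard-library-only launcher core. The function names are hypothetical, and the image name (obolibrary/odkfull) and the /work mount point are assumptions for illustration; a real launcher would also need to handle environment variables, memory options, and so on.

```python
import subprocess
from pathlib import Path


def build_docker_command(tool_args, repo_root,
                         image="obolibrary/odkfull", workdir="/work"):
    """Build the argument list for a `docker run` invocation.

    The image name and the container-side mount point are assumptions
    for illustration; adjust them to whatever the project really uses.
    """
    return [
        "docker", "run", "--rm",
        # Mount the repository into the container.
        "-v", f"{Path(repo_root).resolve()}:{workdir}",
        # Run the command from the mounted directory.
        "-w", workdir,
        image,
        *tool_args,
    ]


def run_in_docker(tool_args, repo_root="."):
    """Spawn docker through subprocess and return its exit code."""
    completed = subprocess.run(build_docker_command(tool_args, repo_root))
    return completed.returncode
```

For example, run_in_docker(["make", "test"]) would spawn docker run with make test as the command, with no dependency beyond the docker executable itself.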

StroemPhi commented 1 year ago

Hi, just wanted to say that I'd be happy to test things on Windows, but I don't know if that makes sense, since I'm using the ODK in an Ubuntu VM because I'm not allowed to install Docker on my Windows host system.

gouttegd commented 1 year ago

I just did a quick test. Windows 10 Pro is still not provided with a Python interpreter by default, but trying to invoke python from the command line automatically spawns the “Windows Store” to download a fresh distribution of Python. So the hurdle may not be as high as I thought…

StroemPhi commented 1 year ago

Yet I wouldn't recommend installing Python via the MS store: https://dev.to/naruaika/why-i-didn-t-install-python-from-the-microsoft-store-5cbd.

gouttegd commented 1 year ago

Not entirely convinced by this. The point about the pip found in the PATH being from a different Python installation than the python in the PATH is a common problem that is not specific to the Windows Store installation; in fact it’s so common that AFAIK Pip developers strongly recommend never calling pip directly (you should call python -m pip instead) precisely for that reason.
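As a small illustration of the python -m pip point (the helper name is hypothetical): a script that needs pip can invoke it through the interpreter that is currently running, which guarantees that pip and python belong to the same installation.

```python
import sys


def pip_command(*args):
    """Build a pip invocation tied to the running interpreter.

    Using sys.executable with `-m pip` avoids picking up whichever
    `pip` executable happens to come first in the PATH.
    """
    return [sys.executable, "-m", "pip", *args]
```

Passing the result to subprocess.run is the programmatic equivalent of typing python -m pip on the command line.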

And in any case, if we do write the launcher script in Python, I stand by my previous message saying that the script should be self-contained and not depend on any non-standard package, so that users should never need to call pip directly or indirectly.