matentzn opened this issue 5 years ago
I guess I have a lot of thoughts about this. The executive summary is that I don't need most of what ODK offers for my projects, so I don't want to pay the complexity cost. I'd prefer to standardize on outputs, not implementation.
I think OBI is a good example to discuss. OBI uses GNU Make, with ROBOT doing the heavy lifting. The OBI Makefile is about 250 lines with lots of comments and space. It's custom but I don't consider it complex.
OBI releases have always been "full" (merged and reasoned with HermiT). We might want to tweak the full release a little to line up with the emerging standard. I'd like to add a "base" release. According to the current release artifact doc, it looks like that would add about four more lines in the Makefile. I'm not interested in the other four optional release artifacts.
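To make that concrete, the base target would look roughly like this (a sketch following the base-file pattern in the `robot remove` documentation; the IRIs and file names are placeholders, not OBI's actual recipe):

```Makefile
# Sketch of a "base" artefact: remove axioms about terms outside the ontology's
# own namespace, then re-annotate the ontology IRI. Placeholders throughout.
obi-base.owl: obi-merged.owl
	robot remove --input $< \
	  --base-iri "http://purl.obolibrary.org/obo/OBI_" \
	  --axioms external \
	  --preserve-structure false --trim false \
	  annotate --ontology-iri "http://purl.obolibrary.org/obo/obi/obi-base.owl" \
	  --output $@
```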
A big chunk of the OBI Makefile is QC, running various SPARQL queries. It looks like they could use some cleanup, but I'm not sure that ODK covers all of OBI's QC. I'm happy enough with the way OBI handles imports, templates, and modules, which seems simpler to me than the ODK way.
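For reference, the shape of those QC rules in either system is roughly a `robot verify` call over a set of SPARQL files (the query file names below are placeholders, not OBI's or the ODK's actual checks):

```Makefile
# Illustrative QC target: run SPARQL checks against the merged ontology and
# write any violations to the build directory. Query names are placeholders.
verify: obi-merged.owl
	robot verify --input $< \
	  --queries src/sparql/missing-label.rq src/sparql/duplicate-label.rq \
	  --output-dir build/
```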
The ODK Makefile template is 652 lines. There are a lot more supporting files in the ODK template directory than OBI has. ODK has a bunch of stuff that OBI doesn't need.
I see how ODK helps Nico manage a whole bunch of ontology projects that have a shared history of tools. OBI doesn't share that history. Looking at what ODK has today, I don't see any benefit for OBI switching, but lots of costs. That calculation may change in the future.
Docker is its own thing. Every time I've tried to use Docker in a project, I've regretted it. I'll take a stab at articulating why, knowing full well that I won't convince anyone. Containers are a fine option for lightweight virtualization, but on macOS and Windows we run Docker inside a VM anyway, providing little benefit (please correct me if I'm wrong). I prefer to just use a VM without Docker. The primary benefit people want from Docker is dependency management, but Ansible scripts are much more flexible than a Dockerfile, NixOS is even better, and a humble JAR file is perfect if you can stick to the JVM.
And as far as "ease of use" goes, at OBO tutorials we've had much more success installing ROBOT than installing ODK.
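For comparison, the whole ROBOT setup on a machine that already has Java amounts to roughly the following (a sketch; the download URL pattern and bare-JAR usage are illustrative rather than the official install instructions, which also ship a small `robot` wrapper script):

```sh
# Fetch the ROBOT JAR and check it runs; a machine with a recent Java is enough.
curl -L -o robot.jar \
  https://github.com/ontodev/robot/releases/latest/download/robot.jar
java -jar robot.jar --version
```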
I'm fine with ODK setting the standards for directory layout and recipes like the release artifacts. I can see some benefits to building the release artifacts into ROBOT, but Nico's reasons are good ones while we're figuring out all the details, so I'm not in a rush.
Rather than standardizing on implementation, I'd prefer to standardize on outputs. Let's build testing tools to make sure that OBI's "full" and "base" release artifacts match perfectly with ODK release artifacts. Let's look at harmonizing OBI and ODK's QC queries.
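One concrete way to do that testing (a sketch; the file and target names below are made up): run `robot diff` between the artefact produced by OBI's Makefile and the same artefact produced by the ODK, and inspect the report.

```Makefile
# Sketch of output-level testing: diff the artefact built by OBI's own Makefile
# against the one built by the ODK. The report lists any axioms present in one
# file but not the other. File and target names are made up.
check-base-parity: obi-base.owl odk-build/obi-base.owl
	robot diff --left obi-base.owl --right odk-build/obi-base.owl \
	  --output base-parity.txt
```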
I'll admit that at least part of my disagreement with @matentzn on this topic is probably based on Conway's Law:
> organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations.
The differences between ODK and ROBOT have so far reflected differences between @cmungall's projects and my projects. We're friends and we have shared goals for OBO and even some shared funding, but we still have distinct communication structures.
A lot of things to discuss here, but it seems we are broadly aligned on end goals. Let's standardize project.yaml, and standardize the expectations for the different module types (https://github.com/information-artifact-ontology/ontology-metadata/pull/36). In theory we can have any number of tools that implement this. I appreciate this creates ambiguity in the short term, but is there anything riding on it? Is someone working on `robot release` right now? If not, then the current situation of a subset of ontologies using the ODK and a subset using hand-crafted Makefiles can carry on.
Can't resist an observation: a 250-line Makefile is nice, but if you have 20 ontologies like OBI, that's 5k lines taking up valuable headspace. Unfortunately, for those of us working with model organisms with different funding streams, we will have multiple similar ontologies to worry about for the foreseeable future.
I think it would help to define the matrix of system objective x target user type and their level of implementation skill. How much know-how is required to implement either the ODK or the ROBOT approach in the build process? I like that both systems could work to harmonize on build-process outputs.
The first user type x skill combination I want a solution for is the fledgling ontology builder who may have some agency IT support (people who know Linux servers but nothing about ontologies). (The user may also be representing a group of curators producing one ontology.) These are the kinds of people I am engaged in training; one can see that people have to get past this point to engage with more complex functionality. They would rather understand the Makefile themselves than have to work with an IT person for whom the ontology part is unknown, but frankly they have never used a Makefile before and cannot, for example, spot the lines that trigger on changes detected in config files.
This entry-level ontology builder needs:
And importantly: nothing more! Provide a different config file, or one whose sections are obviously ignorable, so that users don't have to ponder what they don't need to turn off or know, which is so much more time-consuming. I.e., even if all the code of a complex system is provided, offer a simple tier-1 path through it. The entry points to both the ODK and ROBOT right now don't make clear the minimal information and settings needed by the entry-level ontology builder.
Then define tier 2 / 3 capabilities & requisite config files.
My 2 cents or pence!
I have a new use case for a shared project.yml. Should we discuss it here, in a new ODK issue, or maybe in an issue on another repo?
I would love to hear it! I don't mind where the issue is discussed. IMO it belongs on the repo where the ontology .md files that drive the website live, but I don't mind either way.
We previously discussed it in a chain of mails with the subject "Contemplating centralised management of ODK configs", but if you now also see a use case: let's go for it! I will support you strongly!
There are some questions now and then on where the ODK starts and ROBOT ends; to roll out both widely, it would be good to understand and agree on the core job each one does.
I am a heavy user of both, so I have some opinions. My main concern is that while the ODK has now widely embraced ROBOT, there are still a lot of ad-hoc (non-standard) Makefiles being built with fairly complex ROBOT (or owltools) pipelines (like OBI, GO, DO, and many more). I am not saying: force everyone into line! (<---!!!!) But I do see the risk of certain aspects of the configuration drifting apart, like release artefact definitions (what is a base, what is a simple release?), QC coverage (which QC checks should be run, on which artefacts, and during which stage of the CI?), import management, etc. So before going into detail, my general suggestion is this:
ROBOT does all operations that transform an ontology (`o-->[]-->o'`) and implements the QC methods themselves (not law, just a rule of thumb). Apart from the occasional `sed` (for dropping some ugly OBO-format stuff, like the owl-axioms section), the ODK does no ontology transformation of its own (apart from calling ROBOT).
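As a sketch of what that rule of thumb looks like in practice (file names, the reasoner, and the exact flags below are illustrative, not the verbatim ODK template):

```Makefile
# The Makefile only orchestrates: every ontology transformation is a ROBOT call,
# with at most a small sed touch-up on the OBO serialisation afterwards.
foo.obo: foo-edit.owl
	robot merge --input $< \
	  reason --reasoner ELK \
	  convert --check false --format obo --output $@.tmp.obo
	sed '/^owl-axioms:/d' $@.tmp.obo > $@ && rm $@.tmp.obo
```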
In my experience there are two areas of contention:
**Why should you use the ODK rather than a custom Makefile + ROBOT?**
The Makefile mainly fulfils the job of defining release file pipelines, report files, and QC standards. Most of it should be fairly standard, and the tooling it needs comes with a single `docker pull`.
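For context, getting the whole toolchain this way looks roughly like the following (the image name is the ODK's published Docker image as far as I know; the exact `docker run` invocation is illustrative, since the ODK normally wraps it in a helper script):

```sh
# Pull the ODK image once, then run a release goal inside it, mounting the
# repository as the working directory.
docker pull obolibrary/odkfull
docker run --rm -v "$PWD":/work -w /work obolibrary/odkfull make prepare_release
```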
In my experience, there is only one good argument against the ODK.
But since setting up a new repo is trivial, and for many ontologies the amount of custom code is relatively low, I don't see much else. The second action item of this ticket is:
**Where should the OBO release artefact definitions live?**
**Option 1: ODK.** The ODK Makefile contains all release artefact definitions as ROBOT chains. There are IMHO three main advantages:
**Option 2: ROBOT.** The user can run a simple `robot release --simple` (or similar) to generate a simple release of any ontology (independent of the ODK).

I am not 100% sure about either side; I just tend towards Option 1 at the moment because I love the transparency argument and I am a heavy user of the "debugging" argument, but I am not sure how general that is here.
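To make the contrast concrete: under Option 1, a "simple" artefact is spelled out as a ROBOT chain in the Makefile, roughly like the sketch below (not the actual ODK recipe); under Option 2, that whole chain would collapse into a single built-in command like the hypothetical `robot release --simple` above.

```Makefile
# Option 1 shape: the "simple" artefact as an explicit ROBOT chain.
# File names, IRIs, and the exact command order are illustrative.
obi-simple.owl: obi-merged.owl
	robot reason --input $< --reasoner ELK \
	  relax \
	  remove --axioms equivalent \
	  reduce --reasoner ELK \
	  annotate --ontology-iri "http://purl.obolibrary.org/obo/obi/obi-simple.owl" \
	  --output $@
```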
Action item: