protegeproject / protege

Protege Desktop
http://protege.stanford.edu
Other
971 stars 229 forks source link

IDs for newly create terms are reused between PRs #1108

Closed sabrinatoro closed 6 months ago

sabrinatoro commented 1 year ago

I created several new terms using Protege, each term has been added in separate PRs. As these PRs are being merged, others are showing conflicts because the same ID (automatically created in Protege) was reused between un-merged PR.

For example: MONDO:0700215 was used to create 'neoplastic meningitis' here: https://github.com/monarch-initiative/mondo/pull/5972/files This PR has not been merged yet.

MONDO:0700215 was used to create another term 'neonatal sepsis' here: https://github.com/monarch-initiative/mondo/pull/5978 Note that this PR was created after the one above, and before the one above was merged.

MONDO:0700215 was also used a 3rd time, in another PR https://github.com/monarch-initiative/mondo/pull/5971

So the first PR that is accepted will create conflict for all the other PR.

It looks like Protege only takes into account the ID used in Master, and doesn't take into consideration which one has been used previously, in another session of Protege. Please let me know if you need more details or screenshots,... Thank you!

gouttegd commented 1 year ago

Can you check whether the Remember last ID between Protégé sessions setting is enabled? Screenshot 2023-02-27 at 20 57 43

sabrinatoro commented 1 year ago

It is : Screenshot 2023-02-27 at 1 09 34 PM

gouttegd commented 1 year ago

OK, looks like a bug then. I wonder if the ID range policy feature somehow interferes with the “remember-last-ID” feature. I’ll need to do some tests.

gouttegd commented 1 year ago

OK, I can reproduce here. As I suspected, this is caused by the added support for ID range policy files. Removing the -idranges.owl file (or renaming it, so that it is ignored) restores the expected behaviour of the “remember last ID” feature.

The problem

The “remember last ID” feature works by saving the last generated ID + 1 into the “Entity Creation Preferences” as the new “Start” value (overriding what may have been manually configured by the user the last time they went to the Settings dialog).

The “ID range policy support” feature works by automatically updating the “Entity Creation Preferences” to set the “Start” value to the lower bound found in the ID range policy (and the “End” value to the corresponding upper bound) – thereby overriding both what may have configured manually and what may have been saved by the “remember last ID” feature.

So the scenario goes like this:

1) Open Protégé on an ontology that has an associated ID range file. Protégé automatically sets the “Start” value for the automatic ID generator to the lower bound of the range assigned to the user (let’s say it’s 1000).

2) The user creates a new entity on a branch. The entity is assigned the ID 1000 and Protégé saves 1001 as the new “Start” value. The user ends their session.

3) The user switches to another branch, starts a new Protégé session and re-opens the ontology. Protégé again automatically sets the “Start” value to the lower bound of the range assigned to the user (1000), overriding the value that was saved in step 2.

4) When the user creates a new entity in that branch, the entity is automatically assigned an ID of 1000.

Basically, the “remember last ID” feature and “support for ID range policy files” feature are incompatible – the ID range policy always takes precedence over the remembered last ID. :(

Possible immediate workaround for users

When an editor is working on the same ontology for a prolonged period a time, and working on several branches concurrently, they may consider temporarily renaming the -idranges.owl file to force Protégé to ignore that file.

Proper fix in Protégé

At the very least, Protégé should offer the option to disable the automatic application of ID range policies. That is, users should be able to tell Protégé to ignore any -idranges.owl file that may be present. Currently, Protégé automatically uses such a file if present without, AFAIK, any possibility of opting out.

This would at least allow users to choose if they prefer not having to reconfigure their ranges when they switch between ontologies, or if they prefer not risking any ID clashes when they switch between branches in the same ontology.

The real solution would be to make the “remember last ID” feature compatible with the application of ID range policies. This would imply changing completely the way the last ID is “remembered”. Instead of remembering a single last ID, we should remember one last ID for each policy.

cmungall commented 8 months ago

Is our interim recommendation to do what @balhoff did in https://github.com/geneontology/go-ontology/pull/25130, and do this proactively over a number of repos? Many people won't have seen this issue reported and may be getting themselves into problems

gouttegd commented 8 months ago

I may have a fix in sight, but there’s no telling when it will be ready.

In the meantime, I don’t have any other workaround than the one suggested above and used by Jim on GO (forcibly disabling the feature by renaming the -idranges.owl file), sorry.

The only other option is for users to select Cancel when Protégé shows them the ID range selection dialog – Protégé will then leave the AUTO_ID_START setting untouched, meaning that any previous value saved from working in another branch will be correctly remembered. But this requires users to know that they have to do that.

gouttegd commented 8 months ago

The proper fix is farther away than I hoped because the ID range policy feature creates another related problem.

Let’s say you open ontology A. Protégé finds an -idranges.owl policy file and applies it, so any new entity you create in A will use the settings found in that policy file.

Then, you open ontology B in another workspace (while the workspace of ontology A is still open; that’s what happens when you answer “No” to the question “Do you want to open the ontology in the current window“). Again, Protégé finds an -idranges.owl in the directory of ontology B and applies it, so any new entity you create in B will use the correct settings for that ontology.

Now you switch back to the first window (ontology A) and create another new entity. You’ll find that the entity is created using the prefix and ID range from ontology B.

That’s because there is only one preferences store that is shared by all workspaces, so any changes to the settings in one workspace (such as what happens when Protégé applies the policy file from ontology B) affects all other workspaces.

gouttegd commented 8 months ago

There is a proposed fix in #1179. Still under testing for now.

It does not fix the situation highlighted in the last comment (when you’re trying to edit more than one ontology concurrently), because fixing that would require some deep changes in the way Protégé manages the preferences. But it does fix the problem with the “remember last ID between Protégé sessions”.

gouttegd commented 6 months ago

Fix proposed in #1179 has been merged. Will be part of a future release.