CatalogueOfLife / backend

Complete backend of COL ChecklistBank
Apache License 2.0

CLB: editorial decisions do not work #1290

Closed yroskov closed 7 months ago

yroskov commented 7 months ago

I am assessing and resolving Issues for a new version of World Plants.

Today, 2024-02-06, I was not able to apply a Complex decision here: https://www.checklistbank.org/catalogue/3/dataset/1141/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&issue=multi%20word%20epithet&limit=100&offset=0

Parsing of the name Gamanthera van der Werff was corrected ("van der Werff" is the author). However, the applied decision was not shown in the interface.

Blocking decisions do not work here:
https://www.checklistbank.org/catalogue/3/dataset/1141/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&issue=inconsistent%20name&limit=100&offset=0

All of the following names were blocked, but the decisions are not shown in the interface: [screenshot] and [screenshot]

yroskov commented 7 months ago

Strangely, blocking decisions work perfectly here:

https://www.checklistbank.org/catalogue/3/dataset/1141/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&issue=indetermined&limit=100&offset=0

yroskov commented 7 months ago

I had a problem applying the "Provisionally Accepted" decision here: https://www.checklistbank.org/catalogue/3/dataset/1141/workbench?decisionMode=_NULL&facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=500&nomstatus=manuscript&offset=0&status=accepted

Decision was not shown:

{ "code": 400, "message": "org.apache.ibatis.exceptions.PersistenceException: \n### Error updating database. Cause: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint \"decision_dataset_key_subject_dataset_key_subject_id_key\"\n Detail: Key (dataset_key, subject_dataset_key, subject_id)=(3, 1141, Celastrales-Celastraceae-Cassinoideae-Gymnosporia-inflata-(S. J. Pei & Y. H. Li)) already exists.\n### The error may exist in life/catalogue/db/mapper/DecisionMapper.xml\n### The error may involve life.catalogue.db.mapper.DecisionMapper.create-Inline\n### The error occurred while setting parameters\n### SQL: INSERT INTO decision (id, dataset_key, subject_dataset_key, subject_id, subject_name, subject_authorship, subject_rank, subject_code, subject_status, subject_parent, note, modified, modified_by, mode, name, status, extinct, temporal_range_start, temporal_range_end, environments , original_subject_id, created_by ) VALUES (nextval('decision_3_id_seq'), ?, ?, ?, ?, ?, ?::RANK, ?::NOMCODE, ?::TAXONOMICSTATUS, ?, ?, now(), ?, ?::EDITORIALDECISION_MODE, ?::jsonb, ?::TAXONOMICSTATUS, ?, ?, ?, ? , ?, ? )\n### Cause: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint \"decision_dataset_key_subject_dataset_key_subject_id_key\"\n Detail: Key (dataset_key, subject_dataset_key, subject_id)=(3, 1141, Celastrales-Celastraceae-Cassinoideae-Gymnosporia-inflata-(S. J. Pei & Y. H. Li)) already exists." }

gdower commented 7 months ago

@mdoering, when I tried to do a complex decision I got:

### Error updating database.  Cause: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "decision_dataset_key_subject_dataset_key_subject_id_key"
  Detail: Key (dataset_key, subject_dataset_key, subject_id)=(3, 1141, Laurales-Lauraceae-Licaria-Laurales-Lauraceae-Licaria-Gamanthera van der Werff) already exists.
### The error may exist in life/catalogue/db/mapper/DecisionMapper.xml
### The error may involve life.catalogue.db.mapper.DecisionMapper.create-Inline
### The error occurred while setting parameters
### SQL: INSERT INTO decision (id,       dataset_key,           subject_dataset_key,     subject_id,     subject_name,     subject_authorship,     subject_rank,     subject_code,     subject_status,     subject_parent,     note,     modified,     modified_by,     mode,     name,     status,     extinct,     temporal_range_start,     temporal_range_end,     environments        , original_subject_id, created_by )     VALUES (nextval('decision_3_id_seq'),       ?,     ?,     ?,     ?,     ?,     ?::RANK,     ?::NOMCODE,     ?::TAXONOMICSTATUS,     ?,     ?,     now(),     ?,     ?::EDITORIALDECISION_MODE,     ?::jsonb,     ?::TAXONOMICSTATUS,     ?,     ?,     ?,     ?    , ?, ? )
### Cause: org.postgresql.util.PSQLException: ERROR: duplicate key value violates unique constraint "decision_dataset_key_subject_dataset_key_subject_id_key"
  Detail: Key (dataset_key, subject_dataset_key, subject_id)=(3, 1141, Laurales-Lauraceae-Licaria-Laurales-Lauraceae-Licaria-Gamanthera van der Werff) already exists.

I also tried configuring the parser to handle it correctly, but I get a 403 even with the correct bearer token provided:

POST /parser/name/config
{
  "scientificName": "Gamanthera van der Werff",
  "authorship": "van der Werff",
  "rank": "genus",
  "genus": "Gamanthera",
  "type": "scientific"
}

yroskov commented 7 months ago

Another example from WWW:

Attempt to block these names failed (code 400): https://www.checklistbank.org/catalogue/3/dataset/1162/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=50&nomstatus=manuscript&offset=0

mdoering commented 7 months ago

Both of the 400 SQL errors just say that there is already a decision for that name in the project, so you cannot create another one, which is correct. The question is why they don't show up...
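
To illustrate why the second insert is rejected, here is a minimal sketch of a unique constraint over (dataset_key, subject_dataset_key, subject_id). It uses an in-memory SQLite table rather than the PostgreSQL database from the error message, and the table is reduced to a few columns. The `create_decision` helper and the `mode` values are assumptions made for the example, not the backend's actual code.

```python
import sqlite3


def create_decision(conn, dataset_key, subject_dataset_key, subject_id, mode):
    """Insert a decision; return False if one already exists for the same subject."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO decision (dataset_key, subject_dataset_key, subject_id, mode) "
                "VALUES (?, ?, ?, ?)",
                (dataset_key, subject_dataset_key, subject_id, mode),
            )
        return True
    except sqlite3.IntegrityError:
        # Raised when the UNIQUE constraint below is violated.
        return False


conn = sqlite3.connect(":memory:")
# Simplified stand-in for the real table: only the constrained columns plus a mode.
conn.execute(
    """CREATE TABLE decision (
        id INTEGER PRIMARY KEY,
        dataset_key INTEGER NOT NULL,
        subject_dataset_key INTEGER NOT NULL,
        subject_id TEXT NOT NULL,
        mode TEXT NOT NULL,
        UNIQUE (dataset_key, subject_dataset_key, subject_id)
    )"""
)

subject = "Laurales-Lauraceae-Licaria-Laurales-Lauraceae-Licaria-Gamanthera van der Werff"
first = create_decision(conn, 3, 1141, subject, "update")   # hypothetical mode value
second = create_decision(conn, 3, 1141, subject, "block")   # same subject, rejected
print(first, second)
```

The first call stores the decision; the second one hits the constraint because a row with the same three key values already exists, which matches the "already exists" detail in the error above.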

mdoering commented 7 months ago

Case Gamanthera van der Werff:

Your links above all have a filter with nomstatus=manuscript in the URL. Any idea where that is coming from?

With the status filter it obviously doesn't yield anything: https://www.checklistbank.org/catalogue/3/dataset/1141/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=500&nomstatus=manuscript&offset=0&q=Gamanthera

But without the filter it does: https://www.checklistbank.org/catalogue/3/dataset/1141/workbench?q=Gamanthera ... but still without a decision, which is an API problem.
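
As a small aid for checking whether a leftover filter hides results, the following sketch strips selected query parameters from a workbench URL using only the Python standard library. The helper name `drop_query_params` is made up for this example, and the URL is a shortened version of the one above.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def drop_query_params(url, names):
    """Return the URL with every query parameter whose name is in `names` removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key not in names
    ]
    # Repeated parameters such as facet=... are preserved in their original order.
    return urlunsplit(parts._replace(query=urlencode(kept)))


url = (
    "https://www.checklistbank.org/catalogue/3/dataset/1141/workbench"
    "?facet=rank&facet=issue&limit=500&nomstatus=manuscript&offset=0&q=Gamanthera"
)
cleaned = drop_query_params(url, {"nomstatus"})
print(cleaned)
```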

And @thomasstjerne I am unable to add any decision in the workbench as described here before: https://github.com/CatalogueOfLife/checklistbank/issues/1313#issuecomment-1829432583

The decision does exist though: https://www.checklistbank.org/catalogue/3/decision?limit=100&name=Gamanthera%20van%20der&offset=0&subjectDatasetKey=1141

mdoering commented 7 months ago

The workbench interface is driven by the ES search, not by the database like the duplicate search. That means there can be data inconsistencies if something fails, and I suspect this is what we see here. The World Plants decision was stored in the database, but the ES search index was not updated properly.

I am reindexing WP as I type, let's see if that solves it.
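
The kind of inconsistency described here can be pictured as a set difference between the two stores. The sketch below is purely illustrative: `missing_from_index` is a hypothetical helper, and the two lists stand in for whatever the database and the search index would actually return for dataset 1141.

```python
def missing_from_index(db_subject_ids, indexed_subject_ids):
    """Return subject IDs that have a decision in the database but not in the index."""
    return sorted(set(db_subject_ids) - set(indexed_subject_ids))


# Subject IDs taken from the duplicate key errors in this thread.
db_subject_ids = [
    "Laurales-Lauraceae-Licaria-Laurales-Lauraceae-Licaria-Gamanthera van der Werff",
    "Celastrales-Celastraceae-Cassinoideae-Gymnosporia-inflata-(S. J. Pei & Y. H. Li)",
]
# Assumed state of the search index: the decisions never arrived there.
indexed_subject_ids = []

stale = missing_from_index(db_subject_ids, indexed_subject_ids)
needs_reindex = bool(stale)  # any missing entry means the index is out of date
print(needs_reindex, stale)
```

If the difference is non-empty, the database already holds decisions that the workbench cannot show, so creating them again fails while the interface still displays nothing.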

mdoering commented 7 months ago

@gdower I have posted your parser config JSON to the API and it worked. Modifying the configs requires an ADMIN role. As this doesn't happen often, could you just send me those configs to be added?
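
For reference, a request like the one above could be assembled as in the sketch below. It only builds the request object and does not send it. The base URL `https://api.checklistbank.org` and the helper name `build_parser_config_request` are assumptions for the example, and `EXAMPLE_TOKEN` is a placeholder. According to this thread, a token without the ADMIN role gets a 403 on this endpoint.

```python
import json
import urllib.request

API_BASE = "https://api.checklistbank.org"  # assumed base URL, not confirmed in this thread


def build_parser_config_request(token, config, base=API_BASE):
    """Build (but do not send) a POST request carrying a parser config as JSON."""
    body = json.dumps(config).encode("utf-8")
    return urllib.request.Request(
        f"{base}/parser/name/config",
        data=body,
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


config = {
    "scientificName": "Gamanthera van der Werff",
    "authorship": "van der Werff",
    "rank": "genus",
    "genus": "Gamanthera",
    "type": "scientific",
}
request = build_parser_config_request("EXAMPLE_TOKEN", config)
print(request.get_method(), request.full_url)
```

Passing the object to `urllib.request.urlopen(request)` would send it; an `urllib.error.HTTPError` with code 403 would then indicate that the token is valid but lacks the required role.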

mdoering commented 7 months ago

reindexing solved it, decisions are now shown: https://www.checklistbank.org/catalogue/3/dataset/1141/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=50&offset=0&q=Gamanthera

mdoering commented 7 months ago

also reindexing WWW. @yroskov anything else?

yroskov commented 7 months ago

Thank you, @mdoering!

That's all for now

yroskov commented 7 months ago

@mdoering, I ran into the same problem today with Entiminae GSD: https://www.checklistbank.org/catalogue/3/dataset/1166/workbench?decisionMode=_NULL&facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=70&offset=0&reverse=true&sortBy=name

Blocking decisions are not shown in the interface. When I tried to block an additional 69 names containing [GENUS NOT SPECIFIED], nothing happened.

mdoering commented 7 months ago

@yroskov @gdower I have reindexed the dataset. If you see this more often, you should be able to re-index datasets yourself on the Options page of a dataset: https://www.checklistbank.org/dataset/1166/options

yroskov commented 7 months ago

Thank you, @mdoering!

Working with SF Orthoptera, I ran into the same problem. All names here should get the decision "Ignore": https://www.checklistbank.org/catalogue/3/dataset/1021/workbench?facet=rank&facet=issue&facet=status&facet=nomStatus&facet=nameType&facet=field&facet=authorship&facet=authorshipYear&facet=extinct&facet=environment&facet=origin&limit=900&offset=0&q=supersp

When I tried to re-index the dataset, I got the message "Request failed with status code 403 User not authorized".

mdoering commented 7 months ago

reindexed