petterreinholdtsen / noark5-tester

Test Noark 5 Core REST API

Unique duplicate rel in _links clarification/false positive? #35

Open ivaylomitrev opened 1 year ago

ivaylomitrev commented 1 year ago

We get hundreds of errors produced by the following statement:

self.failure("unique duplicate rel %s in _links for %s" % (rel, url))

An example of journalpost looks like this:

failure: unique duplicate rel https://rel.arkivverket.no/noark5/v5/api/sakarkiv/journalpost/ in _links for {DOMAIN}/n5ws/v1/struktur/journalpost/?$filter=klasse%2FsystemID+eq+%27cefd9d97-1bd3-4fa7-b34c-86067e8e81f9%27&$top=10&$skip=0
failure: unique duplicate rel https://rel.arkivverket.no/noark5/v5/api/sakarkiv/journalpost/ in _links for {DOMAIN}/n5ws/v1/struktur/journalpost/?$filter=klasse%2FsystemID+eq+%2714c12b9c-c0dc-46e0-b912-ea0733ce3d36%27&$top=10&$skip=0

This seems to be because the same rel is present for various klasse results in the result set, each pointing to its related journalposter (which seems to be a valid case).

Some help resolving the confusion would be highly appreciated. Is this due to a misunderstanding of the specification on our side or is that a valid scenario not covered by the tester tool?

petterreinholdtsen commented 1 year ago

[ivaylomitrev]

This seems to be because the same rel is present for various klasse results in the result set, each pointing to its related journalposter (which seems to be a valid case).

If you can provide the JSON for one of the problematic results, I will be able to give a more trustworthy response.

I had a look at one of the files from one of my export-all runs, .../struktur/registrering/?%24filter%3Dklasse%252FsystemID%2Beq%2B%25271b63366a-c339-44d2-b4e4-bdf64ddb6594%2527%26%24top%3D10%26%24skip%3D0.json. In this file, I see these relation keys:

"_links": {
  "https://rel.arkivverket.no/noark5/v5/api/metadata/tilgangsrestriksjon/": {
    "href": "https://gw-ntnu.intark.uh-it.no/documaster/n5ws8091/n5ws/v1/kodelister/tilgangsrestriksjon/"
  },
  "https://rel.arkivverket.no/noark5/v5/api/metadata/dokumentmedium/": {
    "href": "https://gw-ntnu.intark.uh-it.no/documaster/n5ws8091/n5ws/v1/kodelister/dokumentmedium/"
  },
  "https://rel.arkivverket.no/noark5/v5/api/arkivstruktur/registrering/": {
    "href": "https://gw-ntnu.intark.uh-it.no/documaster/n5ws8091/n5ws/v1/struktur/registrering/{?$filter&$orderby&$top&$skip}",
    "templated": true
  },
  "self": {
    "href": "https://gw-ntnu.intark.uh-it.no/documaster/n5ws8091/n5ws/v1/struktur/registrering/"
  },
  "https://rel.arkivverket.no/noark5/v5/api/metadata/kassasjonsvedtak/": {
    "href": "https://gw-ntnu.intark.uh-it.no/documaster/n5ws8091/n5ws/v1/kodelister/kassasjonsvedtak/"
  }
}

I note that the 'self' link does not match the search used to generate the list. It therefore cannot be trusted for use with deletion, and it is not the correct 'self' link for reproducing the result.
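A rough way to check this mismatch, as a sketch only: compare the path and query parameters of the request URL with those of the 'self' href. The function name `self_matches_request` and the `example.com` URLs are made up for illustration and are not part of noark5-tester.

```python
from urllib.parse import urlsplit, parse_qsl


def self_matches_request(request_url, links):
    """Return True when the 'self' href has the same path and query as the request.

    Hypothetical helper, not taken from noark5-tester.
    """
    href = links.get("self", {}).get("href")
    if href is None:
        return False
    req, own = urlsplit(request_url), urlsplit(href)
    # Compare decoded query parameters so that encoding and ordering do not matter.
    return (req.path == own.path and
            sorted(parse_qsl(req.query)) == sorted(parse_qsl(own.query)))


# Placeholder URLs (assumed), shaped like the search discussed above.
request_url = ("https://example.com/n5ws/v1/struktur/registrering/"
               "?$filter=klasse%2FsystemID+eq+%27abc%27&$top=10&$skip=0")
bare_self = {"self": {"href": "https://example.com/n5ws/v1/struktur/registrering/"}}
full_self = {"self": {"href": request_url}}

print(self_matches_request(request_url, bare_self))
print(self_matches_request(request_url, full_self))
```

With a bare 'self' href like the one in the file above, the check reports a mismatch because the search parameters are missing.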

Further, I note that the self href entry does not have an identical entry under an official relation key. I suspect a more sensible _links result for a search result would drop the code list relations and look something like this:

"_links": {
  "https://rel.arkivverket.no/noark5/v5/api/arkivstruktur/registrering/": {
    "href": "https://gw-ntnu.intark.uh-it.no/documaster/n5ws8091/n5ws/v1/struktur/registrering/?%24filter%3Dklasse%252FsystemID%2Beq%2B%25271b63366a-c339-44d2-b4e4-bdf64ddb6594%2527%26%24top%3D10%26%24skip%3D0"
  },
  "self": {
    "href": "https://gw-ntnu.intark.uh-it.no/documaster/n5ws8091/n5ws/v1/struktur/registrering/?%24filter%3Dklasse%252FsystemID%2Beq%2B%25271b63366a-c339-44d2-b4e4-bdf64ddb6594%2527%26%24top%3D10%26%24skip%3D0"
  }
}

Some help resolving the confusion would be highly appreciated. Is this due to a misunderstanding of the specification on our side or is that a valid scenario not covered by the tester tool?

I suspect it is a valid issue, but hard to know for sure without seeing the details involved.

-- Happy hacking Petter Reinholdtsen

ivaylomitrev commented 1 year ago

I will provide some more information tomorrow. I do acknowledge the issue with the self rel (and, more importantly, its accompanying type rel) returned on the root level of the filter response. I want to run some tests to see if modifying that would fix the issue.

My original impression was that this failure was raised due to the presence of multiple dokumentbeskrivelse in a result set each pointing to different registrering parents (for example), but I am starting to doubt this assumption after a debugging session.

I still think there might be an issue to fix in the tool, because the same rel key (../arkivstruktur/registrering) will be returned both in the _links of dokumentbeskrivelse results and in separate registrering filter requests all with different values, but I will provide concrete examples tomorrow.

ivaylomitrev commented 1 year ago

I managed to debug this and see what happens.

  1. We enter the recursive hateoas method
  2. It invokes the Root API URL which invokes the sakarkiv and arkivstruktur URLs
  3. When processing sakarkiv, the https://rel.arkivverket.no/noark5/v5/api/sakarkiv/journalpost/ rel is encountered, which returns a templated .../journalpost URL on our end. The template part is dropped and the URL is stored as the href for rel https://rel.arkivverket.no/noark5/v5/api/sakarkiv/journalpost/
  4. Later, a GET .../journalpost request is made as part of recursing through all encountered URLs
  5. This returns 5 registry entries where the root-level _links in the response contain the https://rel.arkivverket.no/noark5/v5/api/sakarkiv/journalpost/ rel again, this time pointing to href .../journalpost?$top=10&$skip=0 (which is the URL used to produce this result set).

Since the hrefs are different, the aforementioned failure is thrown:

self.failure("unique duplicate rel %s in _links for %s" % (rel, url))
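The sequence above can be modelled with a small sketch. This is a simplified reconstruction of the behaviour described in the steps, not the tester's actual code: the class `RelRegistry`, the method `record`, and the `example.com` URLs are assumptions made for illustration. Only `self.rels` and the failure message come from the thread.

```python
class RelRegistry:
    """Simplified, hypothetical model of the duplicate-rel check described above."""

    def __init__(self):
        self.rels = {}      # rel -> first href seen for that rel
        self.failures = []  # collected failure messages

    def record(self, rel, href, url):
        # Drop a URI template suffix such as {?$filter&$orderby&$top&$skip}.
        href = href.split("{", 1)[0]
        first_seen = self.rels.setdefault(rel, href)
        if first_seen != href:
            self.failures.append(
                "unique duplicate rel %s in _links for %s" % (rel, url))


rel = "https://rel.arkivverket.no/noark5/v5/api/sakarkiv/journalpost/"
registry = RelRegistry()

# Step 3: the templated href found while processing sakarkiv (placeholder domain).
registry.record(rel,
                "https://example.com/journalpost/{?$filter&$orderby&$top&$skip}",
                "https://example.com/sakarkiv/")

# Step 5: the same rel returned again with the query that produced the result set.
registry.record(rel,
                "https://example.com/journalpost/?$top=10&$skip=0",
                "https://example.com/journalpost/")

print(registry.failures)
```

In this model the second call is flagged only because the stored href lacks the query string, which matches the behaviour observed in the debugging session.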

Even apart from this URL, another discrepancy is encountered later when processing dokumentbeskrivelse GET requests, whose _links also contain the rel 'https://rel.arkivverket.no/noark5/v5/api/sakarkiv/journalpost/', which in our case points to:

.../journalpost/first/?$filter=dokumentbeskrivelse/any(e:e/systemID eq '0b9b3733-f2c9-46f3-8d02-71b520109bfc')&$top=1&$skip=0

(decoded for better readability) which also conflicts with the one originally stored in self.rels.
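For reference, percent-encoded query strings like the ones in the failure messages can be decoded with Python's standard library. This is a minimal sketch using the query part of the first journalpost failure reported at the top of this issue:

```python
from urllib.parse import unquote_plus

# Query part of the first failure message above.
encoded = ("?$filter=klasse%2FsystemID+eq+"
           "%27cefd9d97-1bd3-4fa7-b34c-86067e8e81f9%27&$top=10&$skip=0")

# unquote_plus turns %XX escapes into characters and '+' into spaces.
decoded = unquote_plus(encoded)
print(decoded)
# ?$filter=klasse/systemID eq 'cefd9d97-1bd3-4fa7-b34c-86067e8e81f9'&$top=10&$skip=0
```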