Closed jschnasse closed 9 years ago
In the data there exists "fulltextOnline" : "http://digitool.hbz-nrw.de:1801/webclient/DeliveryManager?pid=1638893&custom_att_2=simple_viewer"
Is this wrong?
[edit:] Ah, now I see what you mean: the URL of the origin of the archived page, not the URL of the archive. Thus, http://www.rhein-lahn-info.de/jakobsweg/ is missing in the lobid data.
Correct! It is not the link to the archived resource that is missing, but the link to the resource that has been archived. :-)
Until today we only took the URL in 655eu into account if the same entity also had a note about being an archived resource - see http://lobid.org/resource?id=HT014997977&format=source as an example. With the new mapping, URLs in 655eu will be stored as lv#fulltextOnline even if these URLs are not tagged as being archives (via their 655e subfields), i.e. also when only 652a="Archiv.." exists.
This may lead to URLs being wrongly qualified as lv#fulltextOnline. Tests so far look good. We will see. (Honestly, dealing with "online resources" and the corresponding fields (655e, 652, 334 ...) is messy.)
After discussion with @jschnasse :
We need a proper property from @acka47 to make the statement <A> <archives> <B>. Reusing lv#fulltextOnline is not adequate.
I will create a class lv:ArchivedWebPage to be used with all edoweb resources as well as a property lv:webPageArchived, ok?
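For illustration, the proposed modelling could be written out as two N-Triples statements per resource. This is only a minimal Python sketch: the expansion of the lv prefix to http://purl.org/lobid/lv#, the helper names, and the pairing of TT002234459 with the rhein-lahn-info URL are assumptions, not taken from the actual mapping.

```python
from typing import List

# Assumption: the lv prefix expands to this namespace.
LV = "http://purl.org/lobid/lv#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def triple(subject: str, predicate: str, obj: str) -> str:
    """Serialize a statement consisting of three IRIs as one N-Triples line."""
    return f"<{subject}> <{predicate}> <{obj}> ."


def archived_page_statements(resource_uri: str, original_url: str) -> List[str]:
    """Hypothetical helper: type the resource and link it to the page it archives."""
    return [
        triple(resource_uri, RDF_TYPE, LV + "ArchivedWebPage"),
        triple(resource_uri, LV + "webPageArchived", original_url),
    ]


# Assumed example pairing of resource and original web page.
statements = archived_page_statements(
    "http://lobid.org/resource/TT002234459",
    "http://www.rhein-lahn-info.de/jakobsweg/",
)
for line in statements:
    print(line)
```

The second statement is the <A> <archives> <B> link; the first one is the class assertion.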
sounds reasonable! +1
@dr0i To Do:
Instead of lv:fulltextOnline, use lv:webPageArchived for links between lobid resource and archived webpage.
Instead of http://data.archiveshub.ac.uk/def/ArchivalResource, use lv:ArchivedWebPage for edoweb resources (see the morph in line 1217).
Deployed to staging. @jschnasse have a look.
ping @jschnasse
Deployed even to production. @jschnasse have a look and raise some thumbs if it does what it should.
What's missing: Add lv:webPageArchived to the JSON-LD context. Will do.
test import is running. looks good so far! Many thx.
+1
We obviously overshot the mark here by typing all Edoweb resources as lv:ArchivedWebPage. Only resources with a URL in 655e should be typed as such, as reported by @jschnasse. That's why I re-opened this issue.
Here I need an example resource. HT014997977 and HT018433961 are in the smallest test set and they are proper ArchivedWebPages.
Examples: HT018585406, HT018585452, HT018585477.
See also this JIRA issue: https://jira.hbz-nrw.de/browse/EDOZWO-480.
So even if 652 states that it's about an "archivierte online resource" it's not (necessarily). (A librarian's rose can be anything.)
@acka47 you may be satisfied with having a look at the ntriples in the outcome of the new transformation in the smallest test set (spares us the time of deploying to staging). Look at HT018585406. This resource is now no longer of type lv:ArchivedWebPage.
+1
@jschnasse just let me know that we will have to inform @literarymachine as soon as this is deployed on production.
Thanks! Team lunch communication. Better than any ticketing system.
In this case, @jschnasse used the other ticketing system (JIRA) to communicate this. Lunch was good and very sunny, though.
@dr0i The latest fix (i.e. only type resources as lv:ArchivedWebPage that have a URL in 655e) hasn't been deployed to production yet. See http://lobid.org/resource/HT018585406 which should NOT be of type lv:ArchivedWebPage.
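The rule behind this fix can be sketched roughly as follows. This is an illustration in Python, not the actual morph: the record layout (a mapping from MAB field keys to lists of values) and the function names are assumptions.

```python
from typing import Dict, List

ARCHIVED_WEB_PAGE = "lv:ArchivedWebPage"

# Assumption: a record is a mapping from MAB field keys such as "655e"
# or "652a" to lists of string values.
Record = Dict[str, List[str]]


def has_url_in_655e(record: Record) -> bool:
    """True if at least one 655e value looks like an http(s) URL."""
    return any(
        value.startswith(("http://", "https://"))
        for value in record.get("655e", [])
    )


def resource_types(record: Record, base_types: List[str]) -> List[str]:
    """Add lv:ArchivedWebPage only when a URL is present in 655e.

    A 652a note alone is deliberately not sufficient.
    """
    types = list(base_types)
    if has_url_in_655e(record):
        types.append(ARCHIVED_WEB_PAGE)
    return types


# Hypothetical records for illustration.
with_url = {"655e": ["http://www.rhein-lahn-info.de/jakobsweg/"], "652a": ["Archiv.."]}
note_only = {"652a": ["Archiv.."]}

print(resource_types(with_url, []))
print(resource_types(note_only, []))
```

With these inputs, only the first record receives the type; the record that merely carries the 652a note stays untyped.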
In the meantime (as of 2015-04-15) the metadata of the resource has changed, e.g. http://lobid.org/resource/HT018585406 now has a URL in 655e and thus is (correctly) of type lv:ArchivedWebPage.
The basic mechanism coded with https://github.com/lobid/lodmill/commit/ed809ffe4ef9f6d7f500525a5f93d0953b313b93 to exclude the type lv:ArchivedWebPage if there is no 655e is also working, as the unit tests don't produce this type when run on the old metadata of HT018585406 (where 655e wasn't present).
Other resources have changed their metadata also, at least that's true for HT018585477.
Deployed to staging and production (since yesterday with lobid/lodmill#669).
@jschnasse Can you point me to a current example of an edoweb resource without a URL in 655e?
Just talked to @jschnasse. This issue has gotten a bit out of hand. We decided to completely revert the addition of type lv:ArchivedWebPage (i.e. no resource at all shall be typed as such) and stick to the issue title, which we have already achieved (i.e. adding the URL of the web page that has been archived to the RDF).
We may get the information whether something is an archived web page from MAB field 051, element 1, see e.g. http://lobid.org/resource?id=TT002234459&format=source where there's a w in element 1: <controlfield tag="051">mw||||||</controlfield>.
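A minimal sketch of such a check, in Python for illustration only. It assumes that "element 1" is counted from zero (which matches the w in mw||||||), that the controlfield is available as a namespace-free XML fragment, and that w really marks an archived web page, which is only the hypothesis stated above. The function name is made up.

```python
import xml.etree.ElementTree as ET


def has_w_in_element_1(controlfield_xml: str) -> bool:
    """Check whether a MAB 051 controlfield carries 'w' in element 1."""
    field = ET.fromstring(controlfield_xml)
    if field.tag != "controlfield" or field.get("tag") != "051":
        return False
    value = field.text or ""
    # Element 1 is taken to be the second character (zero-based index 1).
    return len(value) > 1 and value[1] == "w"


sample = '<controlfield tag="051">mw||||||</controlfield>'
print(has_w_in_element_1(sample))
```

For the sample taken from TT002234459 this yields True; a field with a different character in that position, or a controlfield with another tag, yields False.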
Hi, just to clarify, there are archived webpages in the catalogue and it makes sense to type them as such. IMO it seems to be a bad idea to override explicit types if no actual use case is in sight. I will keep an eye on this in future issues. If the 'w' indicates a type "archivedWebpage" it would add valuable information to the dataset. Unfortunately this type seems to be completely undocumented. In any case it would be better to open a new ticket for the type thing.
In any case it would be better to open a new ticket for the type thing.
I did so with #152.
Just for the record: the current index is not sufficient for edoweb, because everything with an entry "archivierte Langzeitresource" in MAB 655e is now typed as "archivedWebpage", which is wrong and also not in accordance with our routine of title imports. Our frontend uses the types to filter search results against lobid, e.g. if a user wants to import a title for a monograph, only certain types in lobid are displayed back to the user. For more information please ask @literarymachine.
Deployed to staging. lv:ArchivedWebPage is no more. Please test.
+1
@jschnasse Please acknowledge if/when this should be made productive.
With the last commit the presumption "archived => cannot be of type book" is removed, which is good e.g. for http://lobid.org/resource/HT018585406 but bad for http://lobid.org/resource/TT002234459. The latter will be handled by #152.
deployed to staging
looks fine!
Deployed to production, closing.
e.g.: http://lobid.org/resource/TT002234459/about
TT002234459 describes an archived web page. The URL of the archived page is stored in MAB 655e, which currently does not exist in the lobid data.
priority in edoweb: high