Closed zuphilip closed 6 years ago
I just wrote an issue on this as well #276
We already have the URL/URI identifiers on our radar; we might hide them and only show specific identifiers, or incorporate them in some link element instead. Regarding the date, this is hopefully a minor bug that initializes the date with today, and the date is not already wrong in the database. We will track that down.
Displaying the journal metadata in the journal issue is more challenging. I assume that the data of the journal is indeed present, yet the mapping from any "in collection" type to its collection type is not functional. We can make a lucky guess, which would solve it for journal issues specifically, but not in the general case.
Here is the mapping from child to parent resource types which we intend to use (provided by @kleinann)
switch (type) {
    // Stacked case labels share one return; `case a || b:` would only ever match `a`.
    case resourceType.monograph:
    case resourceType.editedBook:
    case resourceType.book:
    case resourceType.referenceBook:
        return [resourceType.bookSet, resourceType.bookSeries];
    case resourceType.bookSet:
        return [resourceType.bookSeries];
    case resourceType.bookChapter:
    case resourceType.bookSection:
    case resourceType.bookPart:
    case resourceType.bookTrack:
    case resourceType.component:
        return [resourceType.editedBook, resourceType.book, resourceType.monograph];
    case resourceType.proceedingsArticle:
        return [resourceType.proceedings];
    case resourceType.journalArticle:
        return [resourceType.journalIssue, resourceType.journalVolume, resourceType.journal];
    case resourceType.journalIssue:
        return [resourceType.journalVolume, resourceType.journal];
    case resourceType.journalVolume:
        return [resourceType.journal];
    case resourceType.report:
        return [resourceType.reportSeries];
    case resourceType.referenceEntry:
        return [resourceType.referenceBook];
    case resourceType.standard:
        return [resourceType.standardSeries];
    case resourceType.dataset:
        return [resourceType.dataset];
    default:
        // return as default just book -- or []
        return [];
}
Until now we hesitated to implement it since we lacked a good heuristic. As you can see, even for journal issues it is not clear whether we need to show the metadata of the journal volume or of the journal itself. Developing such a heuristic will be among the next steps.
For instance, we could traverse the list of valid container types left to right (from preferred to least preferred) and signal a hit as soon as the title property is present. It would be helpful to have your thoughts on this.
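A minimal sketch of that left-to-right heuristic, assuming the container's properties follow the `<type>_title` naming that appears elsewhere in this thread (e.g. journalIssue_title, journal_title). The function name and the example object are hypothetical and not taken from the LOC-DB code base.

```javascript
// Hypothetical helper: returns the first candidate type whose
// `<type>_title` property is a non-empty string, or null if none is.
function firstContainerTypeWithTitle(resource, candidateTypes) {
  for (const type of candidateTypes) {
    const title = resource[`${type}_title`];
    if (typeof title === "string" && title.trim() !== "") {
      return type; // candidates are ordered by preference, so the first hit wins
    }
  }
  return null; // no candidate type carries a title
}

// Example: the issue-level title is empty, the journal-level title is set.
const exampleContainer = {
  journalIssue_title: "",
  journal_title: "Gender & Society",
};
const exampleHit = firstContainerTypeWithTitle(
  exampleContainer,
  ["journalIssue", "journalVolume", "journal"]
);
console.log(exampleHit); // "journal"
```

Returning null instead of a default type leaves the decision about what to display for an untitled container to the caller.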
I'm not sure if I get the problem. I thought that in the data model discussion, we decided to implement only 2 levels of hierarchy; for journal articles, this would be journal article and journal issue; for collections, book-chapter and edited book. Logically, all the information that applies to hierarchical levels above the journal article or book chapter itself should be in journal issue or edited book. Only for the export to the OpenCitations Model, the data that we keep in our upper hierarchical level must be split up into several elements. Did I get this wrong, Anne? Or is there a problem that I'm missing here?
@kleinann it is a quite technical problem, but not easy to solve.
It is true that we have only the two levels of hierarchy, but on both levels we have about 150 different properties (such as journalArticle_title, journalIssue_title, journal_title). This raises a problem when the container metadata should be displayed.
For example: we have a JOURNAL_ARTICLE on the lower level and hence display the property subset that belongs to JOURNAL_ARTICLE.
On the upper level, we have JOURNAL_ISSUE, for which we also display the associated properties.
Now the problem is that the subset of properties associated with JOURNAL_ISSUE may be empty, since the correct metadata, e.g. for JOURNAL_VOLUME or JOURNAL, is located in different fields. That's the point where we now have to make a guess (see above).
We now traverse the hierarchy of properties upwards until we find a non-empty title property. Currently, we only have 2 levels of resources, but the container resource still contains a hierarchy of properties, so to speak, which we need to traverse until we find something... (just implemented)
I don't understand what you have to guess here, because we are speaking about our ingested data in our data model, which we fully control ourselves. As for the practical implementation, I think you have to add all information from every hierarchical level, i.e. the container resource in the frontend should contain the journal title from the JOURNAL, the volume number from the JOURNAL_VOLUME and the issue number from the JOURNAL_ISSUE (maybe more).
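A minimal sketch of that idea: build the container line from one field per hierarchical level instead of guessing a single level. Only journal_title is a property name mentioned in this thread; journalVolume_number and journalIssue_number are assumed names for illustration, as are the function name and the output format.

```javascript
// Hypothetical helper: collects one piece of information per level
// (journal, volume, issue) and skips levels that are empty.
function describeJournalContainer(container) {
  const parts = [];
  if (container.journal_title) {
    parts.push(container.journal_title);
  }
  if (container.journalVolume_number) { // assumed property name
    parts.push(`Vol. ${container.journalVolume_number}`);
  }
  if (container.journalIssue_number) { // assumed property name
    parts.push(`No. ${container.journalIssue_number}`);
  }
  return parts.join(", ");
}

console.log(describeJournalContainer({
  journal_title: "Gender & Society",
  journalVolume_number: "25",
  journalIssue_number: "1",
}));
// "Gender & Society, Vol. 25, No. 1"
```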
Good point, we actually need such a concise list of relevant properties for each container type. Maybe we could have a dedicated session regarding this mapping at some point?
I guess that we already have these properties implemented in the backend. @anlausch Is there a pointer to code where one can see the properties which are implemented for each container type?
There was once a Hackpad with a draft of the properties for the new data model that Anne shared with me. @anlausch - if you implemented it like that, this would maybe still do the job? Like Philipp, I think that it should be possible to do an exact mapping.
In the back end we implement more or less all properties for each type, as we discussed the specific hierarchy only for journal_article (journal_issue, journal_volume, journal respectively). For other types you told me that a) the hierarchy is not clear and b) no specific properties were mentioned. Having only a parent and a child resource does not mean that we do not need to curate the whole hierarchy. Most of the properties are needed for everything anyway, e.g. contributors, identifiers, title, embodiment etc. If you think that this is wrong, we can of course just delete those. The model is here: https://github.com/locdb/loc-db/blob/datamodel/api/schema/bibliographicResource.js . Let me know what you think.
Regarding wrong information I will of course check what's going on.
@anlausch - I couldn't find the "pages" (or firstpage / lastpage) property in journal article, book chapter, book section, book part, book track, component, proceedings article, dataset and reference entry. Is it missing, or did I just look in the wrong place?
The properties for the pages are part of the resource embodiment. Therefore they can be found in the type-specific embodiment. The definition looks like this:
const resourceEmbodimentSchema = new Schema({ // Resource Embodiment
identifiers: [identifiersSchema],
type: {type: String}, // digital or print
format: String, // IANA media type
firstPage: Number,
lastPage: Number,
url: String,
scans:[scanSchema]
});
They are primarily used for our self-ingested scans, right? Or do you also fill these values from external metadata sources?
I also try to fill those in case the data is available.
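A minimal sketch of how a page range could be read from a list of embodiments shaped like the resourceEmbodimentSchema above. The function name, the output format, and the example page numbers are made up for illustration; only firstPage and lastPage come from the schema.

```javascript
// Hypothetical helper: returns "first-last" from the first embodiment
// that carries a numeric firstPage, or null if none does.
function pageRange(embodiments) {
  for (const embodiment of embodiments || []) {
    if (Number.isFinite(embodiment.firstPage)) {
      const { firstPage, lastPage } = embodiment;
      if (Number.isFinite(lastPage) && lastPage !== firstPage) {
        return `${firstPage}-${lastPage}`;
      }
      return `${firstPage}`; // single page or unknown last page
    }
  }
  return null; // no embodiment has page information
}

// Illustrative values only.
console.log(pageRange([{ type: "print", firstPage: 8, lastPage: 37 }])); // "8-37"
console.log(pageRange([{ type: "digital" }])); // null
```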
Regarding the wrong date: I checked the raw data again (hope I found the right entry) and there is no date given. Maybe the problem is then in the front end?
Will check in the front end; the date is most likely initialized with "today" when no information is given. We will fix that.
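A minimal sketch of how a missing date can be kept missing instead of silently becoming "today". In JavaScript, `new Date()` with no argument yields the current time, so a guard is needed before constructing the date. The function name is hypothetical and this is not the actual front-end code.

```javascript
// Hypothetical helper: returns a Date for usable input and null otherwise,
// so the UI can show "no date" instead of the current day.
function parseOptionalDate(raw) {
  if (raw === undefined || raw === null || raw === "") {
    return null; // nothing given: do not fall back to new Date()
  }
  const parsed = new Date(raw);
  // An unparseable string produces an Invalid Date whose time is NaN.
  return Number.isNaN(parsed.getTime()) ? null : parsed;
}

console.log(parseOptionalDate(undefined)); // null
console.log(parseOptionalDate("not a date")); // null
console.log(parseOptionalDate("2007-09-01").getUTCFullYear()); // 2007
```

The explicit null check also matters because `new Date(null)` is the epoch (1970-01-01), which would be just as misleading as today's date.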
Schedule:
I just uploaded a Sage journal article and the metadata is not what I expected: Shouldn't there be more information in the journal issue part?
The journal article has the information "In: Gender & Societ; 1, SAGE Publications; 2" But the journal is called Gender & SocietY, is there a limit on characters there? Then I am not sure what the 1 and the 2 stand for. The journal is in Volume 25, Issue 1. Is the 5 from 25 just cut?!
As can be seen here https://locdb.bib.uni-mannheim.de/locdb-dev/bibliographicResources/5b0bd07093d5536341f88224 the journal issue has number and volume as expected. Additionally, it has two ISSNs for the journal. My guess would also be a character limit on the fields in the front end.
We fixed the remaining problem of some missing characters in some of the fields. Everything should be finished now, if not please raise a new, specific issue.
I added a new resource by DOI 10.1177/1468796807080237 and switched to the resolve tab. This new resource appears in the Agenda view (currently at the bottom) as a Journal Article and Journal Issue. But the information about the journal name, volume number, and issue number is missing. Moreover, the journal issue has today's date, which is clearly wrong. The author and publisher information is missing as well. BTW, the URLs shown here seem unnecessary to me, no?