OpenAPC / openapc-de

Collect and disseminate information on fee-based Open Access publishing
https://treemaps.openapc.net/

How to reference copies in institutional repositories or CRIS systems #55

Closed njahn82 closed 8 years ago

njahn82 commented 9 years ago

Many institutions also register Open Access journal articles in their local repositories or CRIS systems. Often, these metadata records contain useful information that others may want to reuse.

The question is how to reference these sources effectively within our dataset. So far, we record the base.url and the repo.id. As @tullney suggested to me, the OAI identifier or the link to the landing page could be used instead. This would increase the discoverability and reuse of the institutional records through the dataset.

Any comments or suggestions?

ioverka commented 9 years ago

If we wanted to collect metadata for the journal articles listed in openAPC, we would probably try to request the metadata records from CrossRef first, simply because the DOI seems to be available for most records in openAPC. I'm not a fan of storing the repository's base.url or any links to specific landing pages in the openAPC dataset (who is supposed to update this in 5 years?). But the OAI identifier could be helpful, as long as something like the OAI data provider registry is around and up-to-date.
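
A minimal sketch of such a DOI-based lookup, assuming the public CrossRef REST API endpoint `https://api.crossref.org/works/{doi}`. The function names and the DOI in the usage note are hypothetical, and `fetch_metadata` needs network access:

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public CrossRef REST API "works" endpoint.
CROSSREF_API = "https://api.crossref.org/works/"


def crossref_url(doi: str) -> str:
    """Build the lookup URL for one DOI, percent-encoding unsafe characters."""
    # The slash between DOI prefix and suffix is kept as-is.
    return CROSSREF_API + urllib.parse.quote(doi.strip(), safe="/")


def fetch_metadata(doi: str, timeout: float = 10.0) -> dict:
    """Fetch the metadata record for one DOI (requires network access)."""
    with urllib.request.urlopen(crossref_url(doi), timeout=timeout) as resp:
        # CrossRef wraps the record in a "message" envelope.
        return json.load(resp)["message"]
```

For example, `crossref_url("10.1234/abc")` builds the request URL for a made-up DOI, and `fetch_metadata` would return the corresponding record if that DOI were registered.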

tullney commented 9 years ago

I can't say how useful the metadata stored in local systems is, and getting additional metadata via the DOI seems to be the right (first) thing to do. There is another benefit of having a link to the institutional repository: Documentation that the open access article has been archived by that institution, something that I'd like institutions to do. Additional metadata and a link to another copy should be reason enough to at least enable putting that information in the files. It allows us to do some basic analyses around "Do funded articles go into local repository systems?".

joschirr commented 9 years ago

The OAI identifier is not stable (it's local to the repository instance and may change due to platform migration). Hence, storing PIDs (URN, DOI, ...) is helpful. It would be a nice feature of academic search engines to provide metadata lookup via and across different PIDs.
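
To illustrate why the identifier is tied to the repository instance, here is a small sketch that splits an OAI identifier of the form `oai:<namespace>:<local-id>` into its parts. The helper name and the example identifier are hypothetical:

```python
from typing import NamedTuple


class OaiIdentifier(NamedTuple):
    namespace: str  # usually the repository's domain name
    local_id: str   # record identifier assigned by the repository software


def parse_oai_identifier(value: str) -> OaiIdentifier:
    """Split 'oai:<namespace>:<local-id>' into namespace and local id."""
    scheme, _, rest = value.partition(":")
    # The local id may itself contain colons, so only split once more.
    namespace, _, local_id = rest.partition(":")
    if scheme != "oai" or not namespace or not local_id:
        raise ValueError(f"not an OAI identifier: {value!r}")
    return OaiIdentifier(namespace, local_id)
```

Both parts depend on the hosting instance: a new domain changes the namespace, and a new platform may renumber the local ids, which is the instability described above.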

ioverka commented 9 years ago

@tullney In our instance there could be a record in the repository without a full text attached to it, so it would be difficult to derive the archiving status directly from the provision of a link. In addition, collecting the information ("Has the article been uploaded?") would require additional resources, because the IR is filled by a completely different group of people in our organization. I'm not sure if we would invest in that.

@joschirr You proved my premise to be wrong, so I retract the conclusion ;)

tullney commented 9 years ago

  1. We could also think about just dropping the two columns from the CSV sample to close the issue; anyone interested in locally archived versions, related publications, etc. could query other systems and use the DOI from the OpenAPC database.
  2. I wouldn't think of any kind of repository/CRIS column as mandatory. I still think it could be useful to provide some guidance for those who want to add this information. This could also be someone else; it doesn't have to be the team/institution that provided the payment information.
  3. If the repository uses new identifiers for e.g. post-print copies (DOI/URN/Handle), this could be useful to have, too. It doesn't change much regarding the increased workload and added complexity compared to the OAI identifier, though.

tullney commented 8 years ago

base.url and repo.id will be removed from apc_de.csv.
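
A minimal sketch of how the two columns could be dropped, assuming a comma-separated file with a header row. The function name is hypothetical and this is not necessarily how the change was made in the repository:

```python
import csv
import io

# Column names taken from the discussion above.
DROPPED = ("base.url", "repo.id")


def drop_columns(src, dst, dropped=DROPPED):
    """Copy CSV rows from src to dst, leaving out the named columns."""
    reader = csv.DictReader(src)
    kept = [name for name in (reader.fieldnames or []) if name not in dropped]
    # extrasaction="ignore" skips the dropped keys still present in each row.
    writer = csv.DictWriter(dst, fieldnames=kept, extrasaction="ignore",
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# Usage with made-up sample data; real files would be opened with newline="".
sample = io.StringIO("doi,euro,base.url,repo.id\n"
                     "10.1234/abc,1000,http://example.org/oai,42\n")
result = io.StringIO()
drop_columns(sample, result)
```

All other columns pass through unchanged and keep their original order.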