Closed dstillman closed 2 years ago
Huh, surprised we don't have a ticket.
Preprint should map to CSL article.
Preprint server (not wedded to that label) should be `publisher`.
I think we'll want series and series number to accommodate working papers in series. Beyond that, only standard fields.
Edit: just looking at arXiv and wondering if we should try to get the ID into a number field? It needs to be citeable
How about "repository" for publisher?
Type mapped to genre.
APA style wants the archive ID. We settled on CSL archive_location for that back when we discussed it @adam3smith when I was writing APA 7.
> APA style wants the archive ID. We settled on CSL archive_location for that back when we discussed it @adam3smith when I was writing APA 7.

Do you remember why? `number`, as used e.g. for patent, seems a better fit. I'm just a bit worried that we have a fair number of styles citing `archive` and `archive_location` across all item types.
Let me look into it
The IDs actually get a little tricky. We currently put arXiv IDs (from arXiv.org or Mendeley import) into Extra as `arXiv` (which maybe should've been `arXiv ID`), and I assumed we'd want to migrate that to a dedicated field, which later might be part of a more flexible many-to-one ID system like we've talked about in the past. But then we'd probably need special logic everywhere to get that to the processor as `number` or whatever it needs to be; a regular CSL mapping wouldn't work because an import back from CSL-JSON would be ambiguous, with multiple possible fields (`number`, `arXivID`, or any other repo-specific ones).
Can we just assume that all preprint archives will use an unambiguous ID format, with an identifiable prefix like `arXiv:`, and we can just store them in a single `archiveID` field, mapped bidirectionally to an appropriate CSL field? And any automated handling will just use the prefix to identify it?
I like the archiveID. Not sure if all servers have that, e.g. OSF preprints technically have an ID but they never use it, but leaving the field empty is fine of course. Where IDs are essential, I think assuming a prefix and unique ID is plausible.
Would we maybe want to add archive ID to all types alongside archive, location in archive, and the new archive place and archival collection? That would unambiguously separate physical and digital locations. CSL could add an archive_id variable
I like it, I think, particularly the electronic vs. physical but we should maybe run by some more people?
> On Fri, Nov 12, 2021, 08:08 Brenton M. Wiernik wrote:
> Would we maybe want to add archive ID to all types alongside archive, location in archive, and the new archive place and archival collection? That would unambiguously separate physical and digital locations. CSL could add an archive_id variable
Does `document` stay mapped to `article` too? Should CSL-JSON `article` import to `document` or `preprint`?
Re: `archiveID` on everything, a concrete example that I've been unsure about: if you have a preprint with an arXiv ID, and then you update metadata and it now is published and has a DOI, we presumably convert that item to `journalArticle`. Do we keep the `archiveID` on the item? Throwing it out seems bad, but it also seems a little conceptually fuzzy, since the item no longer really represents that version. arXiv.org obviously keeps the page and lists the DOI, but the canonical source of metadata would be the publisher, and that metadata wouldn't have the arXiv ID.
More practically, do styles know not to use the `archiveID` for published articles?
> Does document stay mapped to article too? Should CSL-JSON article import to document or preprint
CSL 1.0.2, which we are hoping to release on Dec 1, has `document`, so preprint should map to `article` and document to `document`.
I'm not sure about the answer to the arXiv questions, but as a data point, arXiv's own BibTeX no longer includes the arXiv ID once an item is published in a journal.
Perhaps converting archiveID to an attached link would be a good way to keep the information but also avoid including the ID in citations to published items?
That's a good idea.
But then do we still need `archiveID` on all item types?
A lot of items might have an electronic archive that should be cited instead of or in addition to a URL. APA, for example, wants archive and archive IDs to be included when the item is not widely available (e.g., articles, reports, manuscripts, books, documents). Examples given in the manual are ProQuest ID numbers and ERIC ID numbers.
9.30 Database and Archive Sources
Database and archive information is seldom needed in reference list entries. The purpose of a reference list entry is to provide readers with the details they will need to perform a search themselves if necessary, not to replicate the path the author of the work personally used. Most periodical and book content is available through a variety of databases or platforms, and different readers will have different methods or points of access. Additionally, URLs from databases or library-provided services usually require a login and/or are session specific, meaning they will not be accessible to most readers and are not suitable to include in a reference list.
Provide database or other online archive information in a reference only when it is necessary for readers to retrieve the cited work from that exact database or archive.
- Provide the name of the database or archive when it publishes original, proprietary works available only in that database or archive (e.g., Cochrane Database of Systematic Reviews or UpToDate; see Chapter 10, Examples 13–14). References for these works are similar to journal article references; the name of the database or archive is written in italic title case in the source element, the same as a periodical title.
- Provide the name of the database or archive for works of limited circulation, such as
  - dissertations and theses published in ProQuest Dissertations and Theses Global,
  - works in a university archive,
  - manuscripts posted in a preprint archive like PsyArXiv (see Chapter 10, Example 73),
  - works posted in an institutional or government repository, and
  - monographs published in ERIC or primary sources published in JSTOR (see Chapter 10, Example 74).
These references are similar to report references; the name of the database or archive is provided in the source element (in title case without italics), the same as a publisher name.
Do not include database information for works obtained from most academic research databases or platforms because works in these resources are widely available. Examples of academic research databases and platforms include APA PsycNET, PsycINFO, Academic Search Complete, CINAHL, Ebook Central, EBSCOhost, Google Scholar, JSTOR (excluding its primary sources collection because these are works of limited distribution), MEDLINE, Nexis Uni, Ovid, ProQuest (excluding its dissertations and theses databases, because dissertations and theses are works of limited circulation), PubMed Central (excluding authors’ final peer-reviewed manuscripts because these are works of limited circulation), ScienceDirect, Scopus, and Web of Science. When citing a work from one of these databases or platforms, do not include the database or platform name in the reference list entry unless the work falls under one of the exceptions.
If you are in doubt as to whether to include database information in a reference, refer to the template for the reference type in question (see Chapter 10).
Finish the database or archive component of the source element with a period, followed by a DOI or URL as applicable (see Sections 9.34–9.36).
OK, so use the same `archiveID` field for `preprint` and `journalArticle`/others, but move known preprint-server IDs to attached links on metadata updating, and translators/people can populate the non-`preprint` `archiveID` fields as needed.
The only problem would be if you manually changed the item type from Preprint to Journal Article. If it's the same field, the `archiveID` value would be preserved and potentially affect citations, which would be different behavior from metadata updating. Or we could override the default behavior and convert to an attached link at that point, to make it the same as during metadata updating, but we wouldn't do that going in the other direction, so it's a little weird.
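To make the proposed behavior concrete, here is a rough Python sketch of converting a preprint to a journal article while moving a known preprint-server ID to an attached link. Everything in it is an assumption for illustration: items are plain dicts rather than real Zotero items, `PREPRINT_LINKS` and `convert_to_journal_article` are hypothetical names, and the attachment shape is simplified.

```python
from copy import deepcopy

# Hypothetical table of preprint-server prefixes and link templates.
# Which servers belong here is an assumption; only arXiv is discussed above.
PREPRINT_LINKS = {"arxiv:": "https://arxiv.org/abs/{id}"}


def convert_to_journal_article(item: dict) -> dict:
    """Return a copy typed as journalArticle, moving a known preprint ID to a link."""
    result = deepcopy(item)
    result["itemType"] = "journalArticle"
    archive_id = result.get("archiveID", "")
    for prefix, template in PREPRINT_LINKS.items():
        if archive_id.lower().startswith(prefix):
            bare_id = archive_id[len(prefix):].strip()
            result.setdefault("attachments", []).append(
                {"linkMode": "linked_url", "url": template.format(id=bare_id)}
            )
            # Drop the field so styles cannot render it for the published version.
            del result["archiveID"]
            break
    return result
```

An `archiveID` without a known preprint prefix (say, an ERIC or ProQuest number) is left in place, so it can still be cited on the converted item. Whether a manual type change should go through the same path is the open question raised above.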
This is what I have so far:
{
    "itemType": "preprint",
    "fields": [
        { "field": "title" },
        { "field": "abstractNote" },
        { "field": "date" },
        { "field": "repository", "baseField": "publisher" },
        { "field": "place" },
        { "field": "archiveID" },
        { "field": "DOI" },
        { "field": "citationKey" },
        { "field": "url" },
        { "field": "accessDate" },
        { "field": "archive" },
        { "field": "archiveLocation" },
        { "field": "shortTitle" },
        { "field": "language" },
        { "field": "libraryCatalog" },
        { "field": "callNumber" },
        { "field": "rights" },
        { "field": "extra" }
    ],
    "creatorTypes": [
        { "creatorType": "author", "primary": true },
        { "creatorType": "contributor" },
        { "creatorType": "editor" },
        { "creatorType": "translator" },
        { "creatorType": "reviewedAuthor" }
    ]
}
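One way to read the `baseField` entry in the schema above: `repository` is a type-specific name for the generic `publisher` field. The Python sketch below shows how a consumer might resolve such names. It is an illustration under that assumption, not Zotero's actual code; `PREPRINT_FIELDS` is a shortened excerpt and `to_base_fields` is a hypothetical helper.

```python
# Shortened excerpt of the schema above, as a Python structure.
PREPRINT_FIELDS = [
    {"field": "title"},
    {"field": "repository", "baseField": "publisher"},
    {"field": "place"},
    {"field": "archiveID"},
]


def to_base_fields(item_fields: dict, schema_fields: list) -> dict:
    """Rename type-specific fields to their base fields; leave others as-is."""
    base_of = {f["field"]: f.get("baseField", f["field"]) for f in schema_fields}
    return {base_of.get(name, name): value for name, value in item_fields.items()}
```

With this reading, anything that works on `publisher` (such as a CSL mapping) would pick up the repository name without needing a preprint-specific rule.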
Some more questions:
1. […] `publisher-place`)?
2. […] `type` (mapped to `genre`) be for, if journal articles (which many/most of these will become) don't have that.
3. […] `publisher`) and "Archive ID" next to it, when there are existing "Archive" and "Loc. in Archive" fields down below. And I'm a bit confused about how "Archive ID" interacts with "Archive" on other types. Would "Archive" be used for digital archives as well, and you use either "Archive ID" or "Loc. in Archive" depending on electronic vs. physical? But we can't use "Archive" here because we need it to map to `publisher`?
4. […] `archiveID` to `number` for now?

1. The preprint type will encompass things like working papers (e.g., in economics), which are sometimes cited with a place, so I think yes.
2. `genre` would hold descriptions like "Working paper". It can be dropped if converted to a journal article.
3. That's correct. It's a little funky, I agree. In most cases `archiveID` would pair up with the other Archive variables. Preprints are an unusual case where the archive and the publisher are the same thing.
4. Hmm, I think so. One concern might be if many styles are written to render `number` indiscriminately.
@adam3smith Would `number` generally work as the electronic archive ID, or might items, e.g., in ERIC or ProQuest, have both, such as a working paper series number and archive ID?
@denismaier @bdarcus What do you think of adding an `archive_id` variable to CSL to distinguish between physical locations (`archive_location`) and electronic ones (`archive_id`)?
Agree with Brenton on the above. I think we'll do fine with `number`; if we want series numbers, we'll use `collection-number`.
Edit: which does mean we'll want series and series number added to the above.
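Pulling the mappings proposed so far into one place, a rough Python sketch of the export direction might look like this. The field/variable pairs are the ones floated in this thread, not a final or shipped mapping, and the names `PREPRINT_TO_CSL` and `preprint_to_csl` are hypothetical.

```python
# Field-to-variable pairs collected from this thread; not the shipped mapping.
PREPRINT_TO_CSL = {
    "repository": "publisher",
    "place": "publisher-place",
    "type": "genre",
    "archiveID": "number",
    "seriesNumber": "collection-number",
}


def preprint_to_csl(item: dict) -> dict:
    """Build a CSL-JSON-like dict for a preprint item (CSL type 'article')."""
    csl = {"type": "article"}
    for field, value in item.items():
        # Skip unmapped fields and empty values.
        if field in PREPRINT_TO_CSL and value:
            csl[PREPRINT_TO_CSL[field]] = value
    return csl
```

The sketch only covers export. Going back from CSL-JSON, `number` alone does not say which item field the value came from, which is the ambiguity raised earlier in the thread.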
Anyone have an idea for an icon for preprints?
We'll need both a custom one in the new style for iOS/web and something based on famfamfam or Fugue for the desktop client:
http://www.famfamfam.com/lab/icons/silk/previews/index_abc.png https://p.yusukekamiyamane.com/icons/preview/fugue.png
(Could be a combination of icons if necessary.)
"script" is sort of funny for this, in a Martin-Luther-nailing-theses-to-the-door sort of way. We're using that for Bill in the client, but our custom icon for Bill is the § symbol, so we could repurpose the script concept for this.
For now, I'm going with "receipt", which doesn't make a ton of sense but looks vaguely unfinished — like a piece of paper ripped off a dot matrix printer.
What about famfamfam's `page_white_gear` or `page_white_go`? Or "receipt" but converted to grayscale to match other print-ish types. Something about the blue just feels off to me.
"receipt" is the top row above. "bill" is the second. I was just saying we could use the bill concept, but we'd definitely do it in white/gray to be closer to the journal article icon.
Oh, right.
Maybe it's just been too many years of seeing the scroll/script used for Bill, but it looks a little weird to me for preprint.
For famfamfam, I think both `page_white_lightning` and `page_white_go` are interesting and emphasize the rapidity of preprints.
From Fugue, I really like `report` or `report-share`. The notebook fringes on the left side of the page feel like a draft or unfinished paper (like receipt but better). The version with the sharing hand emphasizes the sharing/feedback solicitation of preprints/working papers.
How about page_white_wrench, because they're (often) still being worked on?
You could also pick your four favorite options and make it a Twitter poll to create some preprint buzz.
Trying out the client on macOS with the new Preprint type. I think the current receipt icon is visually too similar to the Journal Article icon. On the macOS color scheme, I can barely see the fringes at the top and bottom, so the Journal Article and Preprint items look really similar.
Yes, we'll be changing it. Priority was just getting this out.
Cool, just wanted to give some feedback in case that wasn't the plan
Starting to work on preprint citations. I'm not getting `Archive ID` mapped to CSL `number` (testing in the style editor in 6.0.8-beta.4+1e3959020). Could someone else check whether that's me or a general issue?
Can you provide a sample minimal style to test that?
MWE:
https://gist.github.com/adam3smith/786485597971865e2a99687f5401841d
Displays patentNumber for patent but `[CSL STYLE ERROR: reference with no printed form.]` for Preprint.
ArchiveID also doesn't show up in CSL JSON from preprints, but I think that's expected? FWIW, I'm testing with https://www.nber.org/papers/w14560 as imported using the NBER translator.
Sorry about that — didn't update a submodule. Try in the latest beta.
Yup, working, thank you!
@adam3smith, @bwiernik, is there anything I should be consulting for this? Anything this needs to be mapped to on the CSL side? I'm not seeing any existing issues for it.