DCLP / dclpxsltbox

Sandbox for development, testing, and review of XSLT for DCLP
http://dclp.github.io/dclpxsltbox/

overlapping texts in DCLP and DDbDP #306

Open paregorios opened 7 years ago

paregorios commented 7 years ago

From email from @rla2118:

In the case of DCLP/65/64564 there may be a conflict with the Duke data, http://www.papyri.info/ddbdp/sb;26;16458 . The Duke text (without Parma’s intro, comm., and extended app. crit.) is what appears in DCLP at http://www.litpap.info/dclp/64564. I would prefer to see the Parma text and metadata there.

I didn’t check if other files on Carmen’s list are competing with DDbDP texts.

@hermaion82 then followed up with this observation:

Rodney, I am afraid that this overlap may also occur at least for TMs 63940, 64242, 26899, which are all prescriptions or lists, and I believe they appear in the DDB as documentary texts. I have not recorded any further conflicts of this kind, but I may be wrong.

So a question for @hcayless and/or @ryanfb : do you have an immediate sense as to what aspect of the mapping or XSLT processes might be causing such omissions?

paregorios commented 7 years ago

Further commentary in another email from @hermaion82:

I discovered that the overlap with Papyri.info is more widespread than I suspected earlier. This might concern @rla2118 and @jcowey: in the spreadsheet, I marked such cases in column K ("Final Issues") with the label "DDBDP overlapping" in bold red. For each of them, DCLP displays the DDB text, while the Parma edition of the text should have been submitted to the DCLP GitHub. They are almost all prescriptions, as is understandable.

@hermaion82 can you provide a direct link to the spreadsheet in question? The "simplified link" in your email doesn't go directly to a spreadsheet.

paregorios commented 7 years ago

cc @jds15 @rogerbagnall

paregorios commented 7 years ago

I assume that what we want to see on a given HTML page in a DDBDP+DCLP combined papyri.info is every text available from every collection, plus every aspect of metadata from every collection. So, this issue is related to #270.

If I'm wrong about this assumption, someone should say so.

rogerbagnall commented 7 years ago

I agree with that assumption. It is, of course, possible that in a case like two different texts of a prescription, the DDbDP editorial board might wish to suppress its text in favor of that in DCLP if the latter is superior, but that will take time and probably not be a high priority. So displaying all data and metadata from both seems essential.

rla2118 commented 7 years ago

I spoke to James about this and we'd be inclined to suppress the DDbDP text, to avoid multiple instances of a single text, which will skew search results. This is a problem in PHI.

paregorios commented 7 years ago

@rla2118 and @jcowey please elaborate. Do you imagine that the DDBDP text will be suppressed automagically (i.e., by code), or by editorial decision and implementation on a case-by-case basis as @rogerbagnall describes above?

cc @jds15

rla2118 commented 7 years ago

I'd say on a case-by-case basis.

ryanfb commented 7 years ago

There may then be two separate places where this needs to be accounted for: display and search/indexing.
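
A minimal sketch of how a case-by-case suppression list could feed both paths, offered only as an illustration and not taken from the papyri.info codebase. The `Edition` record, the `SUPPRESSED` set, and the functions `visible_editions`, `render_page`, and `index_documents` are all hypothetical names, and the single entry in `SUPPRESSED` is a placeholder rather than an editorial decision anyone has made.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edition:
    """Hypothetical record: one collection's edition of a text."""
    tm: str          # Trismegistos number, e.g. "64564"
    collection: str  # "ddbdp" or "dclp"
    text: str


# Hypothetical hand-maintained list of (TM number, collection) pairs.
# Each entry would represent one editorial decision; this one is a placeholder.
SUPPRESSED = {("64564", "ddbdp")}


def visible_editions(editions, suppressed=SUPPRESSED):
    """Return the editions that are not on the suppression list."""
    return [e for e in editions if (e.tm, e.collection) not in suppressed]


def render_page(tm, editions):
    """Display path: list every unsuppressed edition for one TM number."""
    return [f"{e.collection}: {e.text}"
            for e in visible_editions(editions) if e.tm == tm]


def index_documents(editions):
    """Search path: emit one index entry per unsuppressed edition."""
    return [{"id": f"{e.collection}/{e.tm}", "text": e.text}
            for e in visible_editions(editions)]
```

Routing both display and indexing through the same filter means a text with no suppression entry still shows every edition from every collection, while a suppressed edition disappears from the page and from search results together.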