artshumrc / giza

JSON API (for TMS Database) and Django 2 application for Digital Giza
http://giza.fas.harvard.edu/

Giza digital library #57

Closed pmanuelian closed 7 years ago

pmanuelian commented 7 years ago

In the Giza Library, Selim Hassan, Giza 2 shows up for me, but gets skipped (along with Giza volumes after 9) on a colleague's browser in Cairo. We tried different tricks (cache refresh, different browsers, etc.), but no solution.

rsinghal commented 7 years ago

That is an odd error. Can you get a screenshot? It could be an issue with your colleague's network connection. Are there any other missing documents?

pmanuelian commented 7 years ago

Sure, here are two screenshots: one from me showing everything in place, the other from Cairo showing Vol. 2 missing (her screen couldn't capture everything, but apparently vol. 9 and beyond were missing too).

In Cairo, my colleague said:

IE: refreshed several times, no change
Chrome: no #2
Firefox: also no #2

Thanks,

Peter

-- Peter Der Manuelian, Philip J. King Professor of Egyptology; Director, Harvard Semitic Museum, Harvard University, 6 Divinity Avenue, Cambridge, MA 02138. peter_manuelian@harvard.edu, 617-496-8558, http://giza.fas.harvard.edu


jkisala commented 7 years ago

Jumping in here: I haven't noticed anything missing for me (though I haven't done a comprehensive check), but some things are displaying oddly. For instance, there are about 20 items listed under "No Author" that definitely have authors. [Image: library page]

jkisala commented 7 years ago

We've been looking at this in further depth, and there are a few more things:

  1. There appear to be hundreds of items missing altogether - by my count, there are 516 unique items listed on the Library page, out of 761 Bibliography records that are set to public access. Among the missing items are Lepsius, Carl Richard. Denkmäler aus Aegypten und Aethiopien. Plates 2, Band 3; six articles by Ludwig Borchardt from Zeitschrift fur Aegyptische Sprache und Altertumskunde; and who knows what else, since unless volumes are numbered it's hard to tell without checking every record individually. (For whatever reason, these do turn up in searches; just not in the full list.)
  2. The first entry under "Unsigned" is all messed up - for some reason it lists Reisner as the author, even though he's not listed anywhere on the TMS record for that PDF; and the rest of the citation is duplicated.
  3. It seems like citations are generally listed in alphabetical/alphanumeric order, but Junker's publications are all out of order - Giza 2 comes after "Z", for instance.

rsinghal commented 7 years ago

  1. I will look into the missing items.
  2. I should just be pulling the data out of TMS as it is, but I'll double check this one.
  3. This has to do with the way the bibliography is written in TMS: "Junker, Hermann. <i>Zu" vs. "Junker, Hermann. Gîza 2. <i>Die". The character "<" comes before "G" in the ASCII table (http://www.asciitable.com/), so the citation that starts its title with an <i> tag sorts first. Does TMS have a sorted field for these bibliographies?
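
To illustrate the ordering described in point 3, here is a minimal Python sketch. It uses the two truncated citation prefixes exactly as quoted above; the `strip_tags` helper is hypothetical and not taken from the actual codebase, and the real sort may well happen elsewhere (for example in the database or search index).

```python
import re

# The two truncated citation prefixes quoted above, exactly as given.
citations = [
    "Junker, Hermann. Gîza 2. <i>Die",
    "Junker, Hermann. <i>Zu",
]

# Plain string sorting compares code points: "<" is 60 and "G" is 71,
# so the citation whose title starts with "<i>" sorts first.
raw_order = sorted(citations)


def strip_tags(text):
    """Hypothetical helper: drop anything that looks like an HTML tag."""
    return re.sub(r"<[^>]+>", "", text)


# Sorting on the tag-free text compares "G" with "Z" instead.
clean_order = sorted(citations, key=strip_tags)

print(raw_order)
print(clean_order)
```

With the tags stripped from the sort key, the "Gîza 2" citation moves ahead of the "Zu" one, which is the order a reader would expect.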

rsinghal commented 7 years ago

A few notes on the various comments here:

  1. I don't see any screenshots from @pmanuelian. I think you will need to upload them directly in GitHub.
  2. Regarding the "No Author" section: the majority of those that do actually have an author don't have a Constituent Type set for the author. Without that, I can't categorize them (modernpeople, ancientpeople, etc). They will need to be fixed in TMS for the website to reflect their categorization.
  3. I am reprocessing the data, so some things, like the Unsigned error, will be fixed.
  4. With regard to Ludwig Borchardt articles not showing up: these don't have PDFs, so I prevent them from displaying in the Digital Library section because my understanding was that the point of that page was to only offer documents that have downloadable links.
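
Points 2 and 4 amount to two rules: records without a PDF are skipped, and records whose author has no Constituent Type cannot be categorized. A rough Python sketch of that logic follows. The record layout, the field names (`author_type`, `pdf`), and the sample data are all assumptions made for illustration; the real API groups and stores things differently.

```python
from collections import defaultdict

# Hypothetical, simplified records; none of this is real TMS data.
records = [
    {"citation": "Doc A", "author_type": "modernpeople", "pdf": "a.pdf"},
    {"citation": "Doc B", "author_type": None, "pdf": "b.pdf"},
    {"citation": "Doc C", "author_type": "modernpeople", "pdf": None},
]


def build_library(records):
    """Group citations by author category, skipping records without a PDF."""
    sections = defaultdict(list)
    for record in records:
        if not record["pdf"]:
            # No downloadable document, so it is left out of the library page.
            continue
        # An author without a Constituent Type cannot be categorized.
        section = record["author_type"] or "No Author"
        sections[section].append(record["citation"])
    return dict(sections)


library = build_library(records)
print(library)
```

Under these assumptions, "Doc B" lands under "No Author" even though it has an author, and "Doc C" never appears, which matches the two symptoms reported in this thread.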

jkisala commented 7 years ago

[Looping in @raronin and @npicardo, so we all know what's going on...]

  1. Here are the screenshots that Peter sent before - the first is from our colleague (with volume 2 missing); the other is Peter's, with everything in place.

    [Image: hassan_giza]

    [Image: screen shot 2017-06-03 at 4 35 25 pm]

  2. I just went through all of these in TMS, and now I'm just confused - all but one of them (O'Connor, David, and David Silverman. "The University Museum in Egypt." Expedition 21 no. 2 (Winter 1979), pp. 4-63) do have related constituents with the role "Author." So I'm not sure what's going on there?

  3. Great, thanks!

  4. You're absolutely right that things without PDFs shouldn't be listed in the Digital Library, but again, I just checked in TMS and those Borchardt ZAS articles do have PDFs attached. I double-checked that everything is set to public access, and I can't find anything wrong on that end.

rsinghal commented 7 years ago

  1. It's not that the published documents don't have a related constituent with the role "Author." The issue is that the constituent itself is not categorized as a Modern Person, so there is no way for me to create a record in the API for that constituent. For example, a search for "miroslav" (http://giza.fas.harvard.edu/search-results/?q=miroslav&category=modernpeople) does not return Miroslav Bárta.

  2. I will need a link to the pubdoc page for one of those Borchardt articles in order to understand what is missing.

jkisala commented 7 years ago

  1. Ah, gotcha. I've never come across a constituent record where we hadn't entered a type before, so that didn't even occur to me! I'll go through and fix those tomorrow, thanks.

  2. So here's the Bibliography record for one of them: http://giza.fas.harvard.edu/pubdocs/678/intro/. The PDF isn't showing up there, either, even though it is here gizamedia.rc.fas.harvard.edu/documents/borchardt_zas_35_2.pdf and everything is linked in TMS as far as I can tell.

rsinghal commented 7 years ago

Thanks for the Bib record. At least I'm being consistent within the API :) But I will look into why the PDF isn't showing up for that record. If I fix it there, it will fix it in the Digital Library.

jkisala commented 7 years ago

I've just gone through and assigned types to those constituents, so they should be all set now!

Going back to the issue of getting things to be properly alphabetical on the library page, if we go through and standardize the "Title" field to be just the title, with no formatting or special characters or anything, would that enable you to code it in to sort by that? (Just want to make sure it wouldn't be wasted effort, before we start overhauling 1000 records.)

Thanks!

rsinghal commented 7 years ago

Yes, that should be fine, as long as they only have letters or numbers. 10 will still come before 2, and Z will come before a, per the ASCII table. But, cleaner data will possibly make it easier to find a solution to make 2 come before 10.
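
As a small illustration of the ordering described here, and of one common way to make 2 come before 10, the following Python sketch compares plain string sorting with a "natural" sort key. The sample titles and the `natural_key` helper are made up for the example; this is not necessarily how the site ends up handling it.

```python
import re

# Made-up sample titles, chosen to show both effects.
titles = ["Giza 10", "Giza 2", "Zeitschrift", "apple"]

# Plain sorting compares character codes: "1" < "2" and "Z" < "a".
plain = sorted(titles)


def natural_key(text):
    """Split into text and digit runs so numbers compare as integers."""
    parts = re.split(r"(\d+)", text)
    # With a capturing group, odd positions are always the digit runs.
    return [int(p) if i % 2 else p.casefold() for i, p in enumerate(parts)]


natural = sorted(titles, key=natural_key)

print(plain)    # ['Giza 10', 'Giza 2', 'Zeitschrift', 'apple']
print(natural)  # ['apple', 'Giza 2', 'Giza 10', 'Zeitschrift']
```

The natural key also folds case, so "apple" no longer trails every capitalized title.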

jkisala commented 7 years ago

Okay, great. Do you know if the Title field actually displays anywhere on the website? I don't think it does, since we generally pull from the Citation field instead. If that's correct, and that field won't be visible to people, we can probably enter the data in a less-accurate-to-the-human-mind but more computer-friendly way - listing volumes as "01" rather than "1" should fix the sorting issue, no? And we could remove umlauts and special characters so that "A" and "Ä" would end up together. (Titles should always start with a capital letter anyway, so at least Z being before a won't be an issue.)
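
The two ideas here, zero-padding volume numbers and dropping diacritics, can also be expressed as a transformation rather than done by hand. The Python sketch below is only an illustration of the equivalent effect: the `sort_key` function and the padding width of 3 are assumptions, and Unicode decomposition does not cover every special character (for example "ß" is left unchanged).

```python
import re
import unicodedata


def sort_key(title, width=3):
    """Return a sort-friendly form: no diacritics, zero-padded numbers."""
    # Decompose accented letters ("Ä" becomes "A" plus a combining mark).
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks, keeping the base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Pad every run of digits so "2" sorts before "10" as plain text.
    return re.sub(r"\d+", lambda m: m.group().zfill(width), base)


print(sort_key("Gîza 2"))    # Giza 002
print(sort_key("Ägypten"))   # Agypten
print(sorted(["Giza 10", "Gîza 2", "Giza 1"], key=sort_key))
```

Entering "01" by hand and padding in code give the same ordering; the manual route has the advantage that no code needs to know which numbers are volume numbers.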

rsinghal commented 7 years ago

I use it as the display text (e.g. http://giza.fas.harvard.edu/ancientpeople/756/full/#published, and also as the top-level title on individual pubdocs pages). I don't have to use that, but the Citation field felt too long for these areas and there didn't seem to be a better option. You are also free to use a field that I would hide from display but that could be used for sorting (even if TMS doesn't offer a true sorted field).

jkisala commented 7 years ago

That makes sense, so let's not mess with that. Does it matter at all what field we designate for this (since I'm not seeing anything sorting-related)? If not, I'd probably go for the "Notes" field, which I don't think we've ever really used.

rsinghal commented 7 years ago

That should be fine.

jkisala commented 7 years ago

I've just finished copying the titles over into the notes field, so that's ready to be drawn on at any point. I expect it'll probably need some refinement along the way once we see how things display, but at least it's a lot cleaner than it was!

rsinghal commented 7 years ago

I can't recreate the missing volumes - I'm hoping it was just a data issue. I did a data refresh, and when I look at prod and dev now, everything seems to be showing up. I also added sorting by the Notes field, so I'm closing this ticket out. We can re-open it or create a new one if the missing volumes come up again.
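
For readers following along, sorting on a hidden field with a fallback can be sketched roughly as below. This is not the code that was actually deployed: the dictionary layout, the field names `title` and `notes`, and the sample records are assumptions for illustration.

```python
# Hypothetical records; "notes" holds the cleaned-up, sort-friendly title.
records = [
    {"title": "Giza 10", "notes": "Giza 10"},
    {"title": "Giza 2", "notes": "Giza 02"},
    {"title": "Untitled report", "notes": ""},
    {"title": "Giza 1", "notes": "Giza 01"},
]


def library_sort_key(record):
    """Sort on the notes field, falling back to the title when it is empty."""
    return (record.get("notes") or record["title"]).casefold()


ordered = [r["title"] for r in sorted(records, key=library_sort_key)]
print(ordered)  # ['Giza 1', 'Giza 2', 'Giza 10', 'Untitled report']
```

The displayed title stays untouched, while the ordering comes entirely from the hidden field.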

jkisala commented 7 years ago

Thanks, Rashmi. I haven't done a comprehensive check, but at least the handful of things that I remembered being missing before are there now. And the sorting looks great.