caltechlibrary / caltechauthors

The CaltechAUTHORS InvenioRDM source code
https://authors.library.caltech.edu

Review NginX logs for people calling Eprints API for website content integration #117

Open rsdoiel opened 3 months ago

rsdoiel commented 3 months ago

Review the NginX logs for RDM authors to find the EPrints references that are being called. The logged URL will be something like https://authors.library.caltech.edu/cgi/jsview. Here's an example of the code people are using in campus webpages. We need a report to figure out who to reach out to so they can correct this on their end (e.g. they should be using RDM's JSON API or feeds).

<script type="text/javascript" src="https://authors.library.caltech.edu/cgi/jsview?view=group&id=Caltech_Center_for_Environmental_Microbial_Interactions_=28CEMI=29"></script>
rsdoiel commented 2 months ago

I grepped /var/log/nginx/access.log* for the string cgi/jsview. The following GET requests look to be the old EPrints include API.

GET /cgi/jsview HTTP/1.1
GET /cgi/jsview?view=group&id=Caltech_Center_for_Environmental_Microbial_Interactions_=28CEMI=29 HTTP/1.1
GET /cgi/jsview?view=group&id=Keck_Institute_for_Space_Studies HTTP/1.1
GET /cgi/jsview?view=group&id=TAPIR HTTP/1.1
GET /cgi/jsview?view=person-az&id=Adhikari-R-X HTTP/1.1
GET /cgi/jsview?view=person-az&id=Beck-J-L HTTP/1.1
GET /cgi/jsview?view=person-az&id=Faber-K-T.creators_name HTTP/1.1
GET /cgi/jsview?view=person-az&id=Faber-K-T.type HTTP/1.1
GET /cgi/jsview?view=person-az&id=Haile-S-M HTTP/1.1
GET /cgi/jsview?view=person-az&id=Hoffmann-M-R HTTP/1.1
GET /cgi/jsview?view=person-az&id=Pellegrino-S HTTP/1.1
GET /cgi/jsview?view=person-az&id=Ravichandran-G HTTP/1.0
GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1

The hosts associated with those requests are likely:

http://faber.caltech.edu/
http://hoffmann.caltech.edu/
http://jimbeck.caltech.edu/
http://lisa.caltech.edu/
http://www.cco.caltech.edu/
http://www.hoffmann.caltech.edu/
http://www.its.caltech.edu/
http://www.jimbeck.caltech.edu/
http://www.pellegrino.caltech.edu/
http://www.tapir.caltech.edu/
http://www.tapir.caltech.edu/pubs/
rsdoiel commented 1 month ago

Who do we reach out to for these sites? Are there appropriate liaison librarians to work with?

tmorrell commented 1 month ago

Could we get the full URLs for the its and cco domains?

tmorrell commented 1 month ago

http://faber.caltech.edu/ fixes are dependent on https://github.com/caltechlibrary/feeds.library.caltech.edu/issues/113
http://jimbeck.caltech.edu/ fixes are dependent on https://github.com/caltechlibrary/feeds.library.caltech.edu/issues/115 and https://github.com/caltechlibrary/caltechauthors/issues/124

rsdoiel commented 1 month ago

Here's what I ran this morning, followed by the results. I filtered the log for "cgi/jsview".

Command run on authors.library.caltech.edu

sudo cat /var/log/nginx/access.log | grep 'cgi/jsview' | cut -d \" -f 2,4 | tr \" ,

Results (the GET path requested and the referrer host):

GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1,http://www.its.caltech.edu/
GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1,http://www.its.caltech.edu/
GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1,-
GET /cgi/jsview?view=person-az&id=Hoffmann-M-R HTTP/1.1,http://www.hoffmann.caltech.edu/
GET /cgi/jsview?view=person-az&id=Hoffmann-M-R HTTP/1.1,http://www.hoffmann.caltech.edu/
GET /cgi/jsview?view=person-az&id=Haile-S-M HTTP/1.1,-
GET /cgi/jsview?view=person-az&id=Haile-S-M HTTP/1.1,-
GET /cgi/jsview?view=person-az&id=Hoffmann-M-R HTTP/1.1,http://hoffmann.caltech.edu/
GET /cgi/jsview?view=person-az&id=Hoffmann-M-R HTTP/1.1,http://hoffmann.caltech.edu/
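
For a deduplicated report, the same extraction could be sketched in Python. This is only a rough sketch, assuming the access log uses NginX's default "combined" format (request in the first quoted field, referrer in the second, which is what the cut -f 2,4 above relies on). The function name tally_jsview and the sample lines (IP address, timestamp, status code, user agent) are made up for illustration and are not taken from the real log.

```python
import re
from collections import Counter

# Pulls the quoted request and referrer fields out of a "combined" format line:
#   ... "request" status bytes "referrer" "user-agent"
LINE_RE = re.compile(r'"(?P<request>[^"]*)" \d{3} \S+ "(?P<referrer>[^"]*)"')


def tally_jsview(lines):
    """Count (request, referrer) pairs for lines that mention cgi/jsview."""
    counts = Counter()
    for line in lines:
        if "cgi/jsview" not in line:
            continue
        match = LINE_RE.search(line)
        if match:
            counts[(match.group("request"), match.group("referrer"))] += 1
    return counts


# Hypothetical sample lines; everything except the request path and
# referrer host is invented for the example.
SAMPLE = [
    '192.0.2.1 - - [01/Jan/2024:00:00:00 +0000] "GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1" 404 153 "http://www.its.caltech.edu/" "Mozilla/5.0"',
    '192.0.2.1 - - [01/Jan/2024:00:00:05 +0000] "GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1" 404 153 "http://www.its.caltech.edu/" "Mozilla/5.0"',
    '192.0.2.2 - - [01/Jan/2024:00:01:00 +0000] "GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1" 404 153 "-" "Mozilla/5.0"',
    '192.0.2.3 - - [01/Jan/2024:00:02:00 +0000] "GET /records HTTP/1.1" 200 512 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    for (request, referrer), n in tally_jsview(SAMPLE).most_common():
        print(f"{n},{request},{referrer}")
```

To run it against the real log, the lines could be read from an open file handle on /var/log/nginx/access.log instead of SAMPLE.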



The logs don't show an explicit path in the referrer URL, so the hostname URL is the best we can do. But looking at the request path, you do have an Author ID we can work with.
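
Pulling the view and ID out of the request path could look something like the sketch below. It assumes the request line has the "METHOD path HTTP/version" shape seen in the results; the function name jsview_params is hypothetical, and the ID is returned exactly as logged (including suffixes such as .creators_name), with no attempt to map it to an RDM record.

```python
from urllib.parse import parse_qs, urlsplit


def jsview_params(request_line):
    """Return (view, id) from a logged request line, or None if it is not a jsview call."""
    parts = request_line.split()
    if len(parts) < 2:
        return None
    url = urlsplit(parts[1])
    if not url.path.endswith("/cgi/jsview"):
        return None
    query = parse_qs(url.query)
    # parse_qs returns a list per key; take the first value or None if absent.
    view = query.get("view", [None])[0]
    ident = query.get("id", [None])[0]
    return view, ident


if __name__ == "__main__":
    print(jsview_params("GET /cgi/jsview?view=person-az&id=Yariv-A HTTP/1.1"))
    print(jsview_params("GET /cgi/jsview?view=group&id=TAPIR HTTP/1.1"))
```

Grouping the extracted IDs by referrer host would give one line per person or group to follow up on.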
tmorrell commented 1 month ago

Haile is no longer at Caltech. Kathy will contact Yariv.