Closed by ctgraham 3 years ago
The OCLC FirstSearch parser sets result.login:
https://github.com/ezpaarse-project/ezpaarse-platforms/blob/1b1b4f018fe55b793de6127620d75f652113eb5f/oclc-fs/parser.js#L30
https://github.com/ezpaarse-project/ezpaarse-platforms/blob/1b1b4f018fe55b793de6127620d75f652113eb5f/oclc-fs/parser.js#L144
No other parser was observed to do this.
The effect of assigning result.login is that the user's actual login, if present in the log, is overwritten.
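The overwrite described above can be illustrated with a minimal sketch. It is not the ezPAARSE implementation: `mergeIntoEvent`, `parseWithLogin`, and `parseWithoutLogin` are hypothetical names, and the merge step is assumed to let parser fields take precedence over fields read from the log line.

```javascript
'use strict';

// Hypothetical stand-in for the step that merges parser output into an
// access event. Assumption: parser fields win over log fields.
function mergeIntoEvent(logFields, parserResult) {
  return Object.assign({}, logFields, parserResult);
}

// Simplified parser mimicking the reported behaviour: it copies the
// FirstSearch "sessionid" URL parameter into result.login.
function parseWithLogin(url) {
  const result = { rtype: 'SEARCH', mime: 'HTML' };
  const match = /sessionid=([^:&]+)/.exec(url);
  if (match) {
    result.login = match[1];
  }
  return result;
}

// Same simplified parser without the assignment to result.login.
function parseWithoutLogin(url) {
  return { rtype: 'SEARCH', mime: 'HTML' };
}

// Fields as they would be read from the EZproxy log line in this report.
const url = 'https://firstsearch.oclc.org:443/WebZ/FSQUERY?format=BI:next=html/records.html:bad=html/records.html:numrecs=10:sessionid=fsapp1-42029-k8dgm19o-rahj6v:entitypagenum=2:0:searchtype=basic';
const logFields = {
  host: '192.168.0.1',
  login: 'USERNAME@pitt.edu',
  'ezproxy-session': 'wZOoMiUxUCi1pBa',
  url: url
};

const overwritten = mergeIntoEvent(logFields, parseWithLogin(url));
const preserved = mergeIntoEvent(logFields, parseWithoutLogin(url));

console.log(overwritten.login); // session id replaces the user login
console.log(preserved.login);   // login from the log is kept
```

Under these assumptions, any parser that leaves `login` unset lets the value from the log pass through unchanged, which matches the behaviour observed for the other parsers.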
Consider the following test.log:
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:42 -0400] "GET https://pitt.idm.oclc.org:8443/connect?session=swZOoMiUxUCi1pBa&qurl=http%3a%2f%2ffirstsearch.oclc.org%2ffsip%3f%26dbname%3dWorldCat%26done%3dreferer HTTP/1.1" 302 0
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:42 -0400] "GET http://firstsearch.oclc.org:80/fsip?&dbname=WorldCat&done=referer HTTP/1.1" 301 0
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/fsip?&dbname=WorldCat&done=referer HTTP/1.1" 302 611
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/html/webscript.html:%3Asessionid=fsapp1-42029-k8dgm19o-rahj6v:sessionid=fsapp1-42029-k8dgm19o-rahj6v: HTTP/1.1" 200 27944
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/html/print.css HTTP/1.1" 200 217
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/css/common.css HTTP/1.1" 200 1215
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/javascript/misc.js HTTP/1.1" 200 1030
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/javascript/calendar.js HTTP/1.1" 200 30447
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://fonts.googleapis.com:443/css?family=Roboto HTTP/1.1" 200 2962
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/images/fs2x2.gif HTTP/1.1" 200 187
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/images/fs_info.gif HTTP/1.1" 200 189
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:43 -0400] "GET https://firstsearch.oclc.org:443/WebZ/FSPrefs?entityjsdetect=:javascript=true:screensize=large:sessionid=fsapp1-42029-k8dgm19o-rahj6v:entitypagenum=1:0 HTTP/1.1" 200 38920
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/fs2x2.gif HTTP/1.1" 200 230
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/fs_info.gif HTTP/1.1" 200 1535
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/nfs_news.gif HTTP/1.1" 200 495
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/ar16x40.gif HTTP/1.1" 200 1094
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/fs_helpsmall.gif HTTP/1.1" 200 2231
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/nfs_help.gif HTTP/1.1" 200 2231
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/fs_infosmall.gif HTTP/1.1" 200 1535
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/worldcat_72x22.gif HTTP/1.1" 200 1740
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/ja16x45.gif HTTP/1.1" 200 1129
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/ko16x47.gif HTTP/1.1" 200 361
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/zh16x79.gif HTTP/1.1" 200 1208
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/zs16x79.gif HTTP/1.1" 200 1199
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/images/oclc_logo67x36.gif HTTP/1.1" 200 1788
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:44 -0400] "GET https://firstsearch.oclc.org:443/favicon.ico HTTP/1.1" 404 332
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:53 -0400] "POST https://firstsearch.oclc.org:443/WebZ/FSQUERY?format=BI:next=html/records.html:bad=html/records.html:numrecs=10:sessionid=fsapp1-42029-k8dgm19o-rahj6v:entitypagenum=2:0:searchtype=basic HTTP/1.1" 200 66621
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_sort.gif HTTP/1.1" 200 817
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_relatedsubjects.gif HTTP/1.1" 200 694
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_relatedauthors.gif HTTP/1.1" 200 1931
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_narrow.gif HTTP/1.1" 200 891
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/nfs_email.gif HTTP/1.1" 200 1378
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_print.gif HTTP/1.1" 200 954
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_export.gif HTTP/1.1" 200 1194
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/nfs_prev.gif HTTP/1.1" 200 1112
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/nfs_next.gif HTTP/1.1" 200 1094
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/icon-bks24.gif HTTP/1.1" 200 1673
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_getit.gif HTTP/1.1" 200 353
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/fs_libowns.gif HTTP/1.1" 200 857
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/icon-url24.gif HTTP/1.1" 200 1591
192.168.0.1 USERNAME@pitt.edu wZOoMiUxUCi1pBa [29/Mar/2020:15:51:54 -0400] "GET https://firstsearch.oclc.org:443/images/icon-com.gif HTTP/1.1" 200 985
When processed via:
curl -s -X POST --no-buffer \
  -H 'Reject-Files: all' \
  -H 'Crypted-Fields: none' \
  -H 'Log-Format-ezproxy: %h %u %{ezproxy-session}i %t "%r" %s %b' \
  -H 'Date-Format: DD/MMM/YYYY:HH:mm:ss Z' \
  -H 'Connection: keep-alive' \
  --data-binary @/tmp/test.log http://localhost:59599 \
  -o /tmp/test.results -D /tmp/test.headers
the log produces test.results with a login of "fsapp1-42029-k8dgm19o-rahj6v":
datetime;date;login;platform;platform_name;publisher_name;rtype;mime;print_identifier;online_identifier;title_id;doi;publication_title;publication_date;unitid;domain;on_campus;log_id;ezpaarse_version;ezpaarse_date;middlewares_version;middlewares_date;platforms_version;platforms_date;middlewares;title;type;subject;geoip-country;geoip-latitude;geoip-longitude;host;ezproxy-session;url;status;size
2020-03-29T19:51:53+00:00;2020-03-29;fsapp1-42029-k8dgm19o-rahj6v;oclc-fs;OCLC Firstsearch;;SEARCH;HTML;;;;;;;;firstsearch.oclc.org;Y;00d54a84c9b665a46704570b1f8d366afc77f001;;;6e43d8b;2021-04-13;909d570;2021-04-14;filter, parser, deduplicator, istex, crossref, sudoc, hal, enhancer, geolocalizer, cut, on-campus-counter, qualifier, anonymizer;;;;;;;192.168.0.1;wZOoMiUxUCi1pBa;https://firstsearch.oclc.org:443/WebZ/FSQUERY?format=BI:next=html/records.html:bad=html/records.html:numrecs=10:sessionid=fsapp1-42029-k8dgm19o-rahj6v:entitypagenum=2:0:searchtype=basic;200;66621
when the expected login would be "USERNAME@pitt.edu":
datetime;date;login;platform;platform_name;publisher_name;rtype;mime;print_identifier;online_identifier;title_id;doi;publication_title;publication_date;unitid;domain;on_campus;log_id;ezpaarse_version;ezpaarse_date;middlewares_version;middlewares_date;platforms_version;platforms_date;middlewares;title;type;subject;geoip-country;geoip-latitude;geoip-longitude;host;ezproxy-session;url;status;size
2020-03-29T19:51:53+00:00;2020-03-29;USERNAME@pitt.edu;oclc-fs;OCLC Firstsearch;;SEARCH;HTML;;;;;;;;firstsearch.oclc.org;Y;00d54a84c9b665a46704570b1f8d366afc77f001;;;6e43d8b;2021-04-13;909d570;2021-04-14;filter, parser, deduplicator, istex, crossref, sudoc, hal, enhancer, geolocalizer, cut, on-campus-counter, qualifier, anonymizer;;;;;;;192.168.0.1;wZOoMiUxUCi1pBa;https://firstsearch.oclc.org:443/WebZ/FSQUERY?format=BI:next=html/records.html:bad=html/records.html:numrecs=10:sessionid=fsapp1-42029-k8dgm19o-rahj6v:entitypagenum=2:0:searchtype=basic;200;66621
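To spot affected rows in a results file, the login column can be checked with a short script. This is a sketch only: `parseResults` and `rowsWithSessionLogin` are hypothetical helpers, the sample rows are trimmed to the first four columns of the output above, and the `fsapp<digits>-` pattern is an assumption drawn from the single session id in this report.

```javascript
'use strict';

// Split semicolon-separated output into row objects keyed by header name.
// Naive split: assumes no field contains a ";" (true for the rows above).
function parseResults(text) {
  const lines = text.split('\n').filter((line) => line.trim() !== '');
  const header = lines[0].split(';');
  return lines.slice(1).map((line) => {
    const values = line.split(';');
    const row = {};
    header.forEach((name, i) => {
      row[name] = values[i];
    });
    return row;
  });
}

// Keep rows whose login looks like a FirstSearch session id.
// Assumption: session ids start with "fsapp", digits, then a hyphen.
function rowsWithSessionLogin(rows) {
  return rows.filter((row) => /^fsapp\d+-/.test(row.login));
}

// Trimmed samples: first four columns of the observed and expected rows.
const observed = [
  'datetime;date;login;platform',
  '2020-03-29T19:51:53+00:00;2020-03-29;fsapp1-42029-k8dgm19o-rahj6v;oclc-fs'
].join('\n');
const expected = [
  'datetime;date;login;platform',
  '2020-03-29T19:51:53+00:00;2020-03-29;USERNAME@pitt.edu;oclc-fs'
].join('\n');

const observedHits = rowsWithSessionLogin(parseResults(observed));
const expectedHits = rowsWithSessionLogin(parseResults(expected));

console.log(observedHits.length); // rows flagged in the observed output
console.log(expectedHits.length); // rows flagged in the expected output
```

For a real file, the sample strings could be replaced with the contents of /tmp/test.results read via `fs.readFileSync`.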
Fixed in https://github.com/ezpaarse-project/ezpaarse-platforms/pull/387
Verified resolved in 3.6.5.