mvolz closed 6 years ago
Are you using the 4.0 version on Docker Hub or the latest master?
And can you reproduce this just by running translation a few times in succession, or does it only happen at different times? Using the latest master of translation-server, I'm not seeing it just running translation repeatedly.
Latest master. I just got it like three times in a row locally. It's happening in production too which is how I noticed. :/
Latest master with translation-server and with the translators as well.
OK, just got it here too, and getting it in the client via the connector too. So not translation-server specific, and possibly just a translator issue.
<td class="a-size-base">
  <div class="a-row">
    <span class="a-size-medium">J.K. Rowling
      <span class="a-color-secondary">(Author)</span>
    </span>
  </div>
  <div class="a-row">
    <span class="a-text-bold">› </span><a class="a-link-normal" href="/J.K.-Rowling/e/B000AP9A6K/ref=dp_byline_cont_pop_book_1">Visit Amazon's J.K. Rowling Page</a>
  </div>
  <div class="a-row a-spacing-base">
  </div>
  <div class="a-row">
    <span class="a-size-small">
      <a class="a-link-normal" href="/s/ref=dp_byline_sr_pop_book_1?ie=UTF8&text=J.K.+Rowling&search-alias=books-uk&field-author=J.K.+Rowling&sort=relevancerank">search results</a> for this author</span>
So it's getting the second span with a link instead of the first one.
I guess the author name trigger is in the link to the search results. This line isn't specific enough:
if(!authors.length) authors = ZU.xpath(baseNode, './/span[@class="contributorNameTrigger"]/a[not(@href="#")]');
(line 217 of https://github.com/zotero/translators/blob/master/Amazon.js)
Possible we're getting different pages, but at least for me it's this line:
https://github.com/zotero/translators/blob/528296d74cd2ce66641d5be581f9af178bf46d9e/Amazon.js#L219
And it's getting the first result instead of the correct second one.
Anyone more familiar with the Amazon translator have any ideas here?
We can change this line to:
if(!authors.length) authors = ZU.xpath(baseNode, './/div[@id="sitbReaderAuthorBlock"]/a[contains(@href, "field-author=")]');
But I have no idea if the relevant field-author link always appears within a sitbReaderAuthorBlock.
Running tests, this doesn't appear to cause any problems, though it's hard to know on what pages each of these many XPaths is used.
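Since the order of the matched links seems to vary, one order-independent option is to read the name from the link's `field-author` query parameter rather than from its text. The following is only a rough sketch of that idea, not what Amazon.js does: `authorFromHref` and `pickAuthors` are hypothetical helpers, the base URL is an assumption, and it relies on the standard `URL` API, which may not be available in every translator environment.

```javascript
// Hypothetical helper (not part of Amazon.js): recover an author name from a
// search link's "field-author" query parameter instead of its link text.
function authorFromHref(href, base = "https://www.amazon.co.uk") {
  let url;
  try {
    url = new URL(href, base); // resolves relative hrefs against the assumed base
  } catch (e) {
    return null; // href could not be parsed
  }
  const name = url.searchParams.get("field-author"); // "+" is decoded to a space
  return name ? name.trim() : null;
}

// Collect distinct author names from all candidate hrefs, whatever their order.
function pickAuthors(hrefs) {
  const names = hrefs.map((h) => authorFromHref(h)).filter(Boolean);
  return [...new Set(names)];
}

// The two hrefs from the byline markup quoted earlier in this thread.
const hrefs = [
  "/J.K.-Rowling/e/B000AP9A6K/ref=dp_byline_cont_pop_book_1",
  "/s/ref=dp_byline_sr_pop_book_1?ie=UTF8&text=J.K.+Rowling&search-alias=books-uk&field-author=J.K.+Rowling&sort=relevancerank",
];
console.log(pickAuthors(hrefs)); // one entry: "J.K. Rowling"
```

Whether every Amazon page variant carries a usable `field-author` link is just as uncertain as for the XPath above, so this would be a fallback at best.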
(Unrelatedly, we seem to no longer get abstracts.)
The author extraction in the Amazon translator is fragile and depends on which page Amazon shows you. For example, I couldn't replicate the problems you see...
(Unrelatedly, we seem to no longer get abstracts.)
I see abstracts in Zotero, but they don't work in Scaffold. Or do you see something different?
For example, I couldn't replicate the problems you see...
I couldn't replicate in Scaffold — only in the browser, and only some of the time. At least on this page, the order of the two results for that unmodified XPath seems to be random (perhaps due to a microservice-related race condition).
I see abstracts in Zotero, but they don't work in Scaffold.
Oh, yeah, same.
Fixed (hopefully) in https://github.com/zotero/translators/pull/1602
So I'd say 3/4 of the time, with the page https://www.amazon.co.uk/Harry-Potter-Deathly-Hallows/dp/1408855712/, I get "search results" as the author's name instead of J.K. Rowling.
[{"itemType":"book","creators":[{"firstName":"search","lastName":"results","creatorType":"author"}],"notes":[],"tags":[],"title":"Harry Potter and the Deathly Hallows: 7/7","ISBN":"9781408855713","date":"1 Sept. 2014","publisher":"Bloomsbury Children's Books","edition":"01 edition","numPages":"640","language":"English","place":"London","libraryCatalog":"Amazon","shortTitle":"Harry Potter and the Deathly Hallows"}]
But this doesn't happen every time, so it's a bit difficult to debug. I'm using translation-server. Any ideas?
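Because the failure is intermittent, it can help to flag bad output automatically while re-running translation. Below is a minimal sketch, assuming the JSON shape shown above. `suspiciousCreators` and the `UI_STRINGS` list are hypothetical and not part of translation-server.

```javascript
// Assumed list of page UI strings that should never end up as a person's name.
const UI_STRINGS = new Set(["search results"]);

// Return every creator whose full name matches one of the UI strings.
function suspiciousCreators(items) {
  const hits = [];
  for (const item of items) {
    for (const c of item.creators || []) {
      const full = [c.firstName, c.lastName]
        .filter(Boolean)
        .join(" ")
        .toLowerCase();
      if (UI_STRINGS.has(full)) {
        hits.push({ title: item.title, name: full });
      }
    }
  }
  return hits;
}

// Trimmed copy of the translation output reported above.
const output = [
  {
    itemType: "book",
    creators: [{ firstName: "search", lastName: "results", creatorType: "author" }],
    title: "Harry Potter and the Deathly Hallows: 7/7",
  },
];
console.log(suspiciousCreators(output)); // flags the "search results" creator
```

Running this over the output of repeated requests would give a rough count of how often the wrong node wins.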