Wingysam / Christmas-Community

Christmas lists for families
GNU Affero General Public License v3.0

503 on data load for Amazon items #2

Closed 004a closed 2 years ago

004a commented 2 years ago
Version Info:
Christmas Community: v1.27.0
Get Product Data: v1.27.0
Node: v15.14.0
PID: 72

I am receiving the following 503 error when attempting to add new items or refresh items from Amazon.

/usr/src/app/node_modules/get-product-name/sites/amazon.js:90
    if (!res.ok) throw new Error(`Res not ok. Status: ${res.status} ${res.statusText}`)
                       ^

Error: Res not ok. Status: 503 Service Unavailable
    at getter (/usr/src/app/node_modules/get-product-name/sites/amazon.js:90:24)
    at processTicksAndRejections (node:internal/process/task_queues:94:5)
    at async fetchData (/usr/src/app/node_modules/get-product-name/index.js:6:16)
    at async /usr/src/app/routes/wishlist/index.js:260:27
Wingysam commented 2 years ago

Do you have an example URL? This is working for me: https://www.amazon.com/XPG-Z1-3000MHz-Silver-AX4U300038G16-DSZ1/dp/B07HSZYNN5

It's possible that Amazon had a temporary outage. That's what HTTP 503 usually means. Can you access Amazon outside of Christmas Community from the IP of the instance?
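
For context, the stack trace in the report shows the library throwing whenever `res.ok` is false. A minimal sketch of that pattern follows, assuming a hypothetical helper name `assertOk` and a plain object standing in for a fetch response. It illustrates why a 503 from Amazon surfaces as an exception instead of as empty product data; it is not the library's actual code beyond the one line quoted in the trace.

```javascript
// Hypothetical helper mirroring the `res.ok` check quoted in the stack trace.
// `res` is any object shaped like a fetch Response (ok, status, statusText).
function assertOk (res) {
  if (!res.ok) {
    throw new Error(`Res not ok. Status: ${res.status} ${res.statusText}`)
  }
  return res
}

// A stand-in for the response described in this report (not a real request).
const blocked = { ok: false, status: 503, statusText: 'Service Unavailable' }

try {
  assertOk(blocked)
} catch (err) {
  console.log(err.message) // Res not ok. Status: 503 Service Unavailable
}
```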

004a commented 2 years ago

Currently trying with:

https://www.amazon.com/Metroid-Dread-Nintendo-Switch/dp/B097B1149G/137-2029418-8939011

or

https://smile.amazon.com/dp/B097B1149G/

With your URL, the app throws the same error.
I have verified the host machine's DNS configuration. A curl to the direct URL also connects, but it returns an automated-access notice and an error page, likely because of curl's default user agent; another host gives the same result: curl "https://www.amazon.com/XPG-Z1-3000MHz-Silver-AX4U300038G16-DSZ1/dp/B07HSZYNN5"
Name resolution from within the container also succeeds, both with nslookup (docker exec -it wishlist nslookup smile.amazon.com) and with getent (docker exec -it wishlist getent hosts smile.amazon.com).


curl "https://www.amazon.com/XPG-Z1-3000MHz-Silver-AX4U300038G16-DSZ1/dp/B07HSZYNN5"
<!--
        To discuss automated access to Amazon data please contact api-services-support@amazon.com.
        For information about migrating to our APIs refer to our Marketplace APIs at https://developer.amazonservices.com/ref=rm_5_sv, or our Product Advertising API at https://affiliate-program.amazon.com/gp/advertising/api/detail/main.html/ref=rm_5_ac for advertising use cases.
-->
<!doctype html>
<html>
<head>
  <meta charset="utf-8">
  <meta http-equiv="x-ua-compatible" content="ie=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1, shrink-to-fit=no">
  <title>Sorry! Something went wrong!</title>
  <style>
  html, body {
    padding: 0;
    margin: 0
  }

  img {
    border: 0
  }

  #a {
    background: #232f3e;
    padding: 11px 11px 11px 192px
  }

  #b {
    position: absolute;
    left: 22px;
    top: 12px
  }

  #c {
    position: relative;
    max-width: 800px;
    padding: 0 40px 0 0
  }

  #e, #f {
    height: 35px;
    border: 0;
    font-size: 1em
  }

  #e {
    width: 100%;
    margin: 0;
    padding: 0 10px;
    border-radius: 4px 0 0 4px
  }

  #f {
    cursor: pointer;
    background: #febd69;
    font-weight: bold;
    border-radius: 0 4px 4px 0;
    -webkit-appearance: none;
    position: absolute;
    top: 0;
    right: 0;
    padding: 0 12px
  }

  @media (max-width: 500px) {
    #a {
      padding: 55px 10px 10px
    }

    #b {
      left: 6px
    }
  }

  #g {
    text-align: center;
    margin: 30px 0
  }

  #g img {
    max-width: 90%
  }

  #d {
    display: none
  }

  #d[src] {
    display: inline
  }
  </style>
</head>
<body>
    <a href="/ref=cs_503_logo"><img id="b" src="https://images-na.ssl-images-amazon.com/images/G/01/error/logo._TTD_.png" alt="Amazon.com"></a>
    <form id="a" accept-charset="utf-8" action="/s" method="GET" role="search">
        <div id="c">
            <input id="e" name="field-keywords" placeholder="Search">
            <input name="ref" type="hidden" value="cs_503_search">
            <input id="f" type="submit" value="Go">
        </div>
    </form>
<div id="g">
  <div><a href="/ref=cs_503_link"><img src="https://images-na.ssl-images-amazon.com/images/G/01/error/500_503.png"
                                        alt="Sorry! Something went wrong on our end. Please go back and try again or go to Amazon's home page."></a>
  </div>
  <a href="/dogsofamazon/ref=cs_503_d" target="_blank" rel="noopener noreferrer"><img id="d" alt="Dogs of Amazon"></a>
  <script>document.getElementById("d").src = "https://images-na.ssl-images-amazon.com/images/G/01/error/" + (Math.floor(Math.random() * 43) + 1) + "._TTD_.jpg";</script>
</div>
</body>
</html>
Wingysam commented 2 years ago

Could you try setting the user agent to Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/94.0.4606.81 Safari/537.36, please? Also, if it's not on a VPS, could you try accessing it in a browser?

004a commented 2 years ago

Unfortunately this is on a headless server. Changing the curl useragent resulted in the same captured output. Passing the useragent in the request headers resulted in a binary stream.
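
One possible explanation for the binary stream, offered as an assumption and not confirmed anywhere in this thread, is that the server returned a compressed body (for example gzip) that was printed without being decoded. The Node sketch below uses a locally generated gzip buffer, not a real Amazon response, to show how the gzip magic bytes can be recognized and the body decoded. `looksGzipped` and `decodeBody` are hypothetical helper names.

```javascript
const zlib = require('zlib')

// Gzip streams start with the two magic bytes 0x1f 0x8b.
function looksGzipped (buf) {
  return buf.length >= 2 && buf[0] === 0x1f && buf[1] === 0x8b
}

// Decode the body to text, decompressing first when it is gzip data.
function decodeBody (buf) {
  const raw = looksGzipped(buf) ? zlib.gunzipSync(buf) : buf
  return raw.toString('utf8')
}

// Simulated compressed response body (not a real Amazon response).
const compressed = zlib.gzipSync(Buffer.from('<title>Example</title>'))
console.log(looksGzipped(compressed)) // true
console.log(decodeBody(compressed)) // <title>Example</title>
```

With curl, the `--compressed` option asks for a compressed response and decodes it before printing, which may be worth trying if the output looks like binary data.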

Wingysam commented 2 years ago

Is it possible to try it from another IP? Is running Firefox over SSH X forwarding an option, or would you prefer not to install Firefox's dependencies on the server?

If Christmas Community is the only thing accessing Amazon from that IP, it could have gotten your IP banned. Is the IP residential or commercial?

004a commented 2 years ago

The IP isn't banned and is commercial. I've installed lynx and am able to browse Amazon successfully, including the affected URL that Christmas Community is attempting to load: https://www.amazon.com/Metroid-Dread-Nintendo-Switch/dp/B097B1149G/137-2029418-8939011

Wingysam commented 2 years ago

Could you try updating to 1.27.1 please? I've changed the useragent on get-product-name to not be an old version of Vivaldi. Restarting the container should also work but the version display of get-product-name prior to the current version is broken.

004a commented 2 years ago

Updated and it works much better now. No longer throwing the 503s. Though it may need a little more tweaking.

Product name and image fill in properly. The price fills in blank even when the item is available, as seen with my Metroid Dread URLs.

The RAM test URL you sent initially is currently listed at $49.99 on Amazon's site but fills in as "Currently Unavailable" in Christmas Community.

Wingysam commented 2 years ago

[image] Both products now appear to be working for me. What country is your server in, and is it still broken for you?

004a commented 2 years ago

The product details and images still load properly when both refreshing data and pasting from a link. The prices do not fill in for amazon.com. The server is in a different country, but all of the other details fill in properly.

004a commented 2 years ago

It may be attempting to load price information. When I go to https://www.amazon.com/gp/product/B017P1QSC0 the page shows a price but refresh data fills in with "Currently unavailable."

Wingysam commented 2 years ago

Refreshing works as intended for me. If I add debug output to the library, could you set an environment variable and check the file the library writes, so we can see whether and where the price data appears?

004a commented 2 years ago

Definitely. What’s the var name and expected value? And where’s the file located?

Wingysam commented 2 years ago

Set GPD_AMAZON_HTML to the location where the file should be written, thanks :)
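
For readers unfamiliar with this kind of debug hook, here is a rough sketch of how a library could write fetched HTML to the path named in `GPD_AMAZON_HTML`. Only the variable name comes from this thread; `dumpHtmlIfRequested` is hypothetical and is not get-product-name's actual code.

```javascript
const fs = require('fs')
const os = require('os')
const path = require('path')

// Hypothetical debug hook: write `html` to the file named by
// GPD_AMAZON_HTML, or do nothing when the variable is unset.
function dumpHtmlIfRequested (html, env = process.env) {
  const target = env.GPD_AMAZON_HTML
  if (!target) return false
  fs.writeFileSync(target, html, 'utf8')
  return true
}

// Usage with an explicit env object and a temporary file.
const demoPath = path.join(os.tmpdir(), 'amazon-debug.html')
dumpHtmlIfRequested('<html>example</html>', { GPD_AMAZON_HTML: demoPath })
console.log(fs.readFileSync(demoPath, 'utf8')) // <html>example</html>
```

A hook like this only sees variables present in the environment of the Node process itself, so the value has to be set where the application is started.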

004a commented 2 years ago

Here's the output from a click on the refresh data button on the Metroid Dread entry. The file generated is 19689 lines.

https://zerobin.net/?8536271c16b35951#lxMkPTB/WzkH2LBvu+kJzbnBpdm7rljYOiExMK+dFlA=

Wingysam commented 2 years ago

Sorry, I didn't get to it in time and now the link is dead. Could you reupload it? Thanks :)

vityav commented 2 years ago

Hello, I'm running into the same issue. Amazon links autofill everything but the price, a plain curl returns a notice about automated data access, and a curl with the specified user agent returns a never-ending binary stream. I tried setting GPD_AMAZON_HTML in the .env file but didn't get any logs out when adding an Amazon link. Does that variable work in the latest Docker image (as of today, 11/28/21)? Are there any other logs I can check? All Etsy links fail to gather names or prices, so it may be an issue on my end.

Edit: Interestingly, this link fills the price in with "Only 2 left in stock - order soon." So it's grabbing the content, just missing the mark. https://www.amazon.com/dp/B000TYDFRW/?psc=1
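
The symptom above (stock text landing in the price field) suggests the scraper is matching a nearby element instead of the price. As an illustration only, a guard like the following could reject text that does not look like a price. `looksLikePrice` and its regular expression are assumptions for this sketch, handle only simple dollar amounts, and are not part of get-product-name.

```javascript
// Hypothetical guard: accept strings such as "$49.99" or "$1,299.00".
const PRICE_PATTERN = /^\$\d[\d,]*(\.\d{2})?$/

function looksLikePrice (text) {
  return PRICE_PATTERN.test(text.trim())
}

console.log(looksLikePrice('$49.99')) // true
console.log(looksLikePrice('Only 2 left in stock - order soon.')) // false
console.log(looksLikePrice('Currently unavailable.')) // false
```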

Wingysam commented 2 years ago

What are you setting the env variable to? It should be a path to save the html to.

vityav commented 2 years ago

I figured out the issue: docker compose wasn't loading my .env file, and for some reason the GPD_AMAZON_HTML variable wasn't being read when I defined it manually in a shell in the container. Attached is the output: amazon.txt

Wingysam commented 2 years ago

Thank you! I should be able to get this working tomorrow.

vityav commented 2 years ago

Thanks! And for reference/searchability, that output was created adding this link: https://smile.amazon.com/dp/B07GBXMBQF

Wingysam commented 2 years ago

Sorry I wasn't able to fix it on the 30th. Looking into this now.

Wingysam commented 2 years ago

This has been fixed in get-product-name #b60f9924.