imageLinks - Githubissues

jericoxnavarro commented 3 years ago

Same issue here. I think the error occurs once I send around 20 GET requests to http://localhost:3000/id/${styleID}/prices.

HTTPError: Response code 403 (Forbidden)
    at Request.<anonymous> (/mnt/d/Project/Hypekicks-web/backend/node_modules/got/dist/source/as-promise/index.js:117:42)
    at processTicksAndRejections (internal/process/task_queues.js:97:5) {
  code: undefined,
  timings: {
    start: 1605155076168,
    socket: 1605155076231,
    lookup: 1605155076231,
    connect: 1605155076231,
    secureConnect: 1605155076231,
    upload: 1605155076231,
    response: 1605155076340,
    end: 1605155076342,
    error: undefined,
    abort: undefined,
    phases: {
      wait: 63,
      dns: 0,
      tcp: 0,
      tls: 0,
      request: NaN,
      firstByte: 109,
      download: 2,
      total: 174
    }
  }
}
Error: Could not connect to StockX while searching '
    at Object.getPrices (/mnt/d/Project/Hypekicks-web/backend/scrapers/stockx-scraper.js:93:23)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
Error: Could not connect to Flight Club while searching '555088-105' Error: 
    at Object.getPrices (/mnt/d/Project/Hypekicks-web/backend/scrapers/flightclub-scraper.js:76:31)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
Error: Could not connect to Goat while grabbing pictures for '555088-105' Error: 
    at Object.getPictures (/mnt/d/Project/Hypekicks-web/backend/scrapers/goat-scraper.js:104:19)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)
HTTPError: Response code 403 (Forbidden)
    at Request.<anonymous> (/mnt/d/Project/Hypekicks-web/backend/node_modules/got/dist/source/as-promise/index.js:117:42)
    at processTicksAndRejections (internal/process/task_queues.js:97:5) {
  code: undefined,
  timings: {
    start: 1605155077174,
    socket: 1605155077499,
    lookup: 1605155077499,
    connect: 1605155077499,
    secureConnect: 1605155077499,
    upload: 1605155077499,
    response: 1605155077966,
    end: 1605155077967,
    error: undefined,
    abort: undefined,
    phases: {
      wait: 325,
      dns: 0,
      tcp: 0,
      tls: 0,
      request: NaN,
      firstByte: 467,
      download: 1,
      total: 793
    }
  }
}
Error: Could not connect to Goat while searching '555088-105' Error:
    at Object.getPrices (/mnt/d/Project/Hypekicks-web/backend/scrapers/goat-scraper.js:64:19)
    at processTicksAndRejections (internal/process/task_queues.js:97:5)

druv5319 commented 3 years ago

I just redownloaded the API and tested it, and it works fine. Might it be because you've sent too many requests? Does it still work, and how often does this show up?

jericoxnavarro commented 3 years ago

Yup, if I send too many requests the error shows up. But I no longer need the image links, I already fixed it: instead of using the image links from GOAT, I managed to get the images from StockX. My concern now is how I can get 1000 results or more. For example, if I search nike in /search/:shoe, I want to get 1000 results, because I want to store them in my DB to optimize the response time of my API. BTW, thank you for your response.

druv5319 commented 3 years ago

App looks great! Check out line 19 of the StockX scraper. There is a hits-per-page parameter whose value you can change. This value is limited, so you're going to have to change the page number parameter along with it. I'm not sure exactly how to change the page number, but I'll figure it out and send it to you later.
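
A rough sketch of how hits per page and a page number could work together to collect a larger result set. This is not the scraper's actual code: `fetchPage` is a hypothetical callback standing in for whatever request the StockX scraper makes, and the zero-based page numbering is an assumption.

```javascript
// Hypothetical sketch: collect up to `wanted` results by walking pages.
// `fetchPage(page, hitsPerPage)` is an assumed callback, not a Sneaks-API function.
// It should resolve to an array of hits for the given zero-based page.
async function collectResults(fetchPage, wanted, hitsPerPage) {
  const results = [];
  for (let page = 0; results.length < wanted; page++) {
    const hits = await fetchPage(page, hitsPerPage);
    if (hits.length === 0) break; // nothing left to fetch
    results.push(...hits);
    if (hits.length < hitsPerPage) break; // partially filled page, so it was the last one
  }
  // Trim in case the final page pushed us past the requested amount.
  return results.slice(0, wanted);
}
```

Stopping on an empty or partially filled page keeps the loop from requesting pages that cannot exist, which matters when the remote side rate limits requests.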

jericoxnavarro commented 3 years ago

Thank you so much😇 I will try it

druv5319 commented 3 years ago

Will you be posting the project on GitHub? I’d love to show it off in my sneaks api repository

jericoxnavarro commented 3 years ago

Yup, after I finish the project. For now it is private, but I will make it public once the project is done.

jericoxnavarro commented 3 years ago

It works, thanks. I changed it to 1000 and it responds with 900+ shoes. I have a question, haha: does the sneaks-api only get shoes? Is there a way for the API to also get streetwear from StockX? BTW, don't close this issue, so that I can notify you when the project is done. Thanks!!

druv5319 commented 3 years ago

You can get streetwear products by deleting line 29 of the StockX scraper. However, it's very limited and will only show prices and streetwear products from StockX.

erikmaday commented 3 years ago

Hey, I've been working on a project with this API and it's great. I too was wondering about the number of results issue, and I was able to modify the search endpoint to accept a results count as a query parameter so that the endpoint would return up to that many results. Wondering if this would be worth a PR, or if it's even functionality you're interested in adding.
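
For illustration, one way the results count could be read safely from a query parameter. The helper name, the default of 10, and the cap of 1000 are all assumptions made for this sketch, not values taken from the API.

```javascript
// Hypothetical helper: turn a raw query-string value into a safe result count.
const DEFAULT_COUNT = 10; // assumed default when the parameter is missing or invalid
const MAX_COUNT = 1000;   // assumed upper bound to avoid huge upstream requests

function parseResultCount(raw) {
  const n = Number.parseInt(raw, 10);
  // Missing, non-numeric, zero, or negative values fall back to the default.
  if (Number.isNaN(n) || n < 1) return DEFAULT_COUNT;
  return Math.min(n, MAX_COUNT);
}
```

In an Express-style handler this might be called as `parseResultCount(req.query.count)`, where the parameter name `count` is likewise only an example.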

druv5319 commented 3 years ago

Yeah! Feel free to submit a PR. Thanks.

erikmaday commented 3 years ago

I also figured out a way to get around the API rate limit.

I only ran into it with the goat scraper. What I determined is if you change the User-Agent in the headers of the request, the requests will go through again.

For example, I switched it to: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:75.0) Gecko/20100101 Firefox/75.0 temporarily.

I may try writing something eventually to auto switch the user-agent if the connection is denied due to a rate limit

EDIT: use this package https://www.npmjs.com/package/random-useragent and instead of giving the User-Agent field in the header a hardcoded value, just generate a random one each time.

EDIT EDIT: that didn't work, because certain user agents would be denied and it was hard to control well enough which ones the package generated. Instead I added this to the top of each scraper

const USER_AGENTS = [
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/42.0.2311.135 Safari/537.36 Edge/12.246',
  'Mozilla/5.0 (X11; CrOS x86_64 8172.45.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.64 Safari/537.36',
  'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_2) AppleWebKit/601.3.9 (KHTML, like Gecko) Version/9.0.2 Safari/601.3.9',
  'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/47.0.2526.111 Safari/537.36',
  'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:15.0) Gecko/20100101 Firefox/15.0.1',
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/51.0.2704.79 Safari/537.36 Edge/14.14393',
  'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.140 Safari/537.36 Edge/17.17134',
  'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; rv:11.0) like Gecko',
  'Mozilla/5.0 (Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; MAARJS; rv:11.0) like Gecko'
]

and then where User-Agents needed to be set, I grabbed a random value using

'User-Agent': USER_AGENTS[Math.floor(Math.random() * USER_AGENTS.length)]
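
Building on that, a rough sketch of the "auto switch the user-agent if the connection is denied" idea. It is untested against Goat, and the names are made up for illustration: `doRequest` is a hypothetical callback that performs one request with the given headers, and the check assumes the thrown error exposes `response.statusCode`, as the HTTPError in the logs above would if it comes from got.

```javascript
// Pick a random entry from a list of user-agent strings.
function pickUserAgent(agents) {
  return agents[Math.floor(Math.random() * agents.length)];
}

// Hypothetical sketch: retry a request with a freshly picked User-Agent
// whenever the server answers 403. `doRequest(headers)` is an assumed callback
// that sends one request and rejects with an error carrying `response.statusCode`.
async function requestWithRotation(doRequest, agents, maxAttempts = 3) {
  let lastError;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await doRequest({ 'User-Agent': pickUserAgent(agents) });
    } catch (err) {
      const status = err.response && err.response.statusCode;
      if (status !== 403) throw err; // only a 403 triggers another attempt
      lastError = err;
    }
  }
  throw lastError; // every attempt was denied
}
```

Because the pick is random, a retry can land on the same user agent again. Capping the attempts keeps a persistent block from turning into an endless loop.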

druv5319 commented 3 years ago

Thanks! Yeah, I've also had the same issue. The sneaks app website hasn't been able to connect to Goat or FlightClub, probably because of the rate limit on the user agent. I still haven't been able to find a workaround for this. Maybe a proxy would help, but I am unsure. Did the random user agent method work in most cases? If so, feel free to make a PR. Edit: I just changed the User-Agent for the Goat and FlightClub scrapers on my sneaks app website. It's able to connect to Goat now, but it may stop working again.

druv5319 / Sneaks-API

imageLinks #11