Giglium / vinted_scraper

A very simple Python package that scrapes the Vinted website to retrieve information about its items.
MIT License

Add support for proxy #17

Closed by Hundred-Killer 9 months ago

Hundred-Killer commented 10 months ago

Is it possible to use a proxy server? I'm not sure yet, but I think that Vinted blocks frequent requests, right?

Giglium commented 10 months ago

Hi, if you send too many requests to Vinted you will receive a 429 status code, and you will have to wait a bit before retrying.
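As a rough illustration of "waiting a bit" after a 429, here is a minimal retry sketch with exponential backoff. It is not part of vinted_scraper: `call_with_backoff`, its parameters, and the fake response sequence are all hypothetical names made up for this example, and the real HTTP call is replaced by a stand-in so the snippet runs without network access.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    request: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `request` until it stops returning 429 or attempts run out."""
    status, body = request()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        # Wait 1x, 2x, 4x, ... the base delay before trying again.
        sleep(base_delay * 2 ** (attempt - 1))
        status, body = request()
    return status, body


# Hypothetical stand-in for a real HTTP call: rate-limited twice, then OK.
_responses = iter([(429, ""), (429, ""), (200, "ok")])
waits = []  # records the delays instead of actually sleeping
result = call_with_backoff(lambda: next(_responses), sleep=waits.append)
print(result, waits)
```

In real use, `request` would wrap whatever call hits Vinted and return its status code and body, and the default `time.sleep` would be left in place.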

According to the requests documentation, you can configure a proxy through the environment variables http_proxy, https_proxy, no_proxy, and all_proxy.
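A quick way to check which proxy settings the environment exposes is the standard library's `urllib.request.getproxies_environment()`, which reads the same `*_proxy` variables. This is only a sketch: the proxy addresses are placeholders, the helper name `proxies_from_environment` is made up for this example, and setting the variables from Python stands in for exporting them in your shell.

```python
import os
import urllib.request


def proxies_from_environment() -> dict:
    """Return the proxy settings found in the *_proxy environment variables."""
    return urllib.request.getproxies_environment()


# Placeholder values; normally these are exported in the shell before
# starting the script rather than set from Python.
os.environ["http_proxy"] = "http://10.10.1.10:3128"
os.environ["https_proxy"] = "http://10.10.1.10:1080"
os.environ["no_proxy"] = "localhost,127.0.0.1"

detected = proxies_from_environment()
print(detected)
```

The returned dict is keyed by scheme (`http`, `https`), with the `no_proxy` exclusions under the `no` key.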

It should be easy to modify the package to add proxy support, since we only need to pass the proxy object through. However, I don't have a proxy, and I don't know of a free proxy service I could use for testing. I will convert this issue from a request for information to an enhancement.

Hundred-Killer commented 10 months ago

This is really a very cool tool for scraping Vinted. I hope you will continue to develop it, thank you!

Hundred-Killer commented 10 months ago

What about the aiohttp library? Why didn't you use it for async requests?

Giglium commented 10 months ago

To be honest, I went with the requests package to speed up development of the V2 version.

For now, I'm using threads to simulate async requests, but of course they don't perform as well as real async requests. For this reason, I'm thinking of implementing async requests in the future, probably with httpx, only because it offers both sync and async requests, which makes it easier for those who are new to asynchronous programming. On the other hand, aiohttp has better performance, which is normally what I care about more.
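To make the threads-versus-async distinction concrete, here is a small standard-library sketch. It is not how vinted_scraper is implemented: the `search_blocking` and `search_async` functions are hypothetical and only simulate a network wait with a short sleep, so the example runs offline.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

QUERIES = ["board games", "lego", "books"]


def search_blocking(query: str) -> str:
    time.sleep(0.05)  # simulated network wait that blocks its thread
    return f"results for {query}"


async def search_async(query: str) -> str:
    await asyncio.sleep(0.05)  # simulated wait that yields to the event loop
    return f"results for {query}"


def run_with_threads(queries):
    # One worker thread per query; each thread blocks while it "waits".
    with ThreadPoolExecutor(max_workers=len(queries)) as pool:
        return list(pool.map(search_blocking, queries))


async def run_with_asyncio(queries):
    # A single thread; the coroutines overlap their waits cooperatively.
    return await asyncio.gather(*(search_async(q) for q in queries))


threaded = run_with_threads(QUERIES)
awaited = asyncio.run(run_with_asyncio(QUERIES))
print(threaded == list(awaited))
```

Both approaches return the same results in the same order; the difference is that the thread pool needs one thread per in-flight request, while the asyncio version runs them all on a single thread.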

I will probably make the final decision based on how the community uses the package.

Giglium commented 9 months ago

As of version 2.1.0, the proxies attribute is exposed directly on the VintedScraper constructor.

An example:

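from vinted_scraper import VintedScraper


def main():
    # Map each URL scheme to the proxy that should handle it.
    proxies = {
        'http': 'http://10.10.1.10:3128',
        'https': 'http://10.10.1.10:1080',
    }
    # Pass the proxies to the scraper constructor (available since 2.1.0).
    scraper = VintedScraper("https://www.vinted.com", proxies=proxies)
    params = {"search_text": "board games"}
    # Run the search through the configured proxies.
    scraper.search(params)


if __name__ == "__main__":
    main()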