Open PeDiot opened 1 year ago
Hi, thanks for the compliment. Sorry, I closed the issue by a misclick. So far I haven't found a parameter that controls the number of items returned. For now, I loop through the pages, since the search API is paginated.
For example:
from vinted_scraper import VintedScraper

def main():
    scraper = VintedScraper("https://www.vinted.com")
    params = {
        "search_text": "board games"
        # Add other query parameters like the pagination and so on
    }
    for i in range(0, 10):
        params["page"] = i
        # items holds the results of the current page only
        items = scraper.search(params)

if __name__ == "__main__":
    main()
Thanks for the tip!
Strangely, I'm getting the same issue. There is a mismatch between the number of items I am able to extract and the number of items that appear for the same request on the website. Have you experienced this? Adding pagination did not help. My code:
`import json
from vinted_scraper import VintedScraper

def to_serializable(obj):
    # Recursively convert model objects into plain lists and dicts
    if isinstance(obj, list):
        return [to_serializable(i) for i in obj]
    elif hasattr(obj, "__dict__"):
        return {key: to_serializable(value) for key, value in obj.__dict__.items()}
    else:
        return obj

def main():
    scraper = VintedScraper("https://www.vinted.fr")
    # Define search parameters with page 1 as the starting point
    params = {
        "search_text": "padel racket",
        "brand_ids": [48801, 372642, 14, 15453, 689757],
        "price_from": 20,
        "currency": "EUR",
        "order": "newest_first",
        "page": 1  # Start with the first page
    }
    all_items = []
    while True:
        # Perform the search
        items = scraper.search(params)
        if not items:
            print(f"No items found on page {params['page']}. Ending search.")
            break  # Exit loop if no more items are found
        print(f"Page {params['page']}: Found {len(items)} items.")
        all_items.extend(items)  # Add found items to the all_items list
        params["page"] += 1  # Move to the next page
    if all_items:
        # Convert the items to a JSON-serializable format
        serializable_items = [to_serializable(item) for item in all_items]
        # Save the results to a JSON file
        with open('vinted_search_results.json', 'w', encoding='utf-8') as f:
            json.dump(serializable_items, f, ensure_ascii=False, indent=4)
        print(f"Saved {len(all_items)} items to 'vinted_search_results.json'")
    else:
        print("No items were found during the search.")

if __name__ == "__main__":
    main()`
Checking the browser's network tab when I change pages on the site, I see that Vinted uses the param `per_page=96`. I don't have time right now, but we can investigate whether this parameter controls the number of items the API returns. Could this be the reason why you see two different results?
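To try this out, here is a minimal sketch that sends `per_page` together with `page` on every request and stops at the first empty page. Whether the API actually honours `per_page` is untested; that is the open question above. `collect_all_pages` is a hypothetical helper, not part of the library, and `max_pages` is just a safety cap I added. It takes the search function as an argument, so it only assumes the function accepts a params dict and returns a list.

```python
from typing import Any, Callable, Dict, List


def collect_all_pages(
    search: Callable[[Dict[str, Any]], List[Any]],
    base_params: Dict[str, Any],
    per_page: int = 96,   # value seen in the browser's network tab
    max_pages: int = 50,  # hypothetical safety cap to avoid endless loops
) -> List[Any]:
    """Call `search` page by page until an empty page comes back."""
    results: List[Any] = []
    for page in range(1, max_pages + 1):
        # Build a fresh dict so the caller's params are never mutated.
        params = {**base_params, "page": page, "per_page": per_page}
        items = search(params)
        if not items:
            break
        results.extend(items)
    return results
```

With the library installed, it could be called as `collect_all_pages(VintedScraper("https://www.vinted.fr").search, {"search_text": "padel racket"})`. Comparing the result count for different `per_page` values would show whether the parameter has any effect.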
@lo1gr I suggest using `VintedWrapper` instead of `VintedScraper`. The `VintedWrapper` object returns the JSON of the API without converting it to the model, so you don't have to transform it back again.
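A minimal sketch of that idea, assuming the raw response is a dict with an `"items"` list (I have not confirmed the exact response shape here). `dump_raw_items` is a hypothetical helper, not part of the library; it takes the search function as an argument, so any object exposing a compatible `search` would work.

```python
import json
from typing import Any, Callable, Dict, List


def dump_raw_items(
    search: Callable[[Dict[str, Any]], Dict[str, Any]],
    params: Dict[str, Any],
    path: str,
) -> int:
    """Run one search and write the raw item dicts to a JSON file."""
    response = search(params)
    # Assumption: the raw response is a dict holding an "items" list.
    items: List[Dict[str, Any]] = response.get("items", [])
    with open(path, "w", encoding="utf-8") as f:
        # The items are already plain JSON data, so no conversion step is needed.
        json.dump(items, f, ensure_ascii=False, indent=4)
    return len(items)
```

With the library installed, this could be called as `dump_raw_items(VintedWrapper("https://www.vinted.fr").search, params, "vinted_search_results.json")`, which removes the need for a `to_serializable` step.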
I'm just starting to use your library, which seems awesome btw. I wanted to know how to change the number of items returned by the `search` method of `VintedScraper`? I've tried to add a `page-size` param in the search params, but it did not work. Thanks for your help!