rdemarqui / perfume_recommender

A perfume recommendation system
https://huggingface.co/spaces/rdemarqui/perfume_recommender

Fragrantica Scraping #2

Open Kageyoshi7777 opened 3 months ago

Kageyoshi7777 commented 3 months ago

Hi, how did you manage to scrape data from Fragrantica with all those limits? I have tried different approaches, but without good results. I'd like to scrape the whole database, so around 90+ perfumes.

rdemarqui commented 2 months ago

Hi. Have you tried restarting Selenium after every n pages scraped? Example:

        # Every 70 requests: save progress, then restart the browser
        if index % 70 == 0 and index != 0:
            print(f"Sample: {index}")
            temp_dataframe.to_csv(temp_name, index=False)
            # Quit the current session and start a fresh one
            driver.quit()
            driver = webdriver.Chrome(options=options)

Here is the full code, used in another project.

Generally, it works.
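
For anyone who wants to try the restart idea on its own, here is a minimal sketch of the same pattern as a standalone loop. It is not the code from the linked project: the names `scrape_with_restarts`, `save_rows`, `make_driver`, `parse_page`, and `checkpoint_path` are hypothetical, and it writes checkpoints with the standard `csv` module rather than a pandas DataFrame. The only thing it assumes about the driver is that it has a `quit()` method.

```python
import csv
from typing import Callable, Iterable, List


def save_rows(rows: List[dict], path: str) -> None:
    """Write all rows collected so far to a CSV checkpoint (hypothetical helper)."""
    if not rows:
        return
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def scrape_with_restarts(
    urls: Iterable[str],
    make_driver: Callable[[], object],
    parse_page: Callable[[object, str], dict],
    checkpoint_path: str,
    restart_every: int = 70,
) -> List[dict]:
    """Scrape each URL, saving progress and restarting the browser periodically."""
    rows: List[dict] = []
    driver = make_driver()
    try:
        for index, url in enumerate(urls):
            # Every `restart_every` pages: save a checkpoint,
            # then replace the browser session with a fresh one.
            if index % restart_every == 0 and index != 0:
                save_rows(rows, checkpoint_path)
                driver.quit()
                driver = make_driver()
            rows.append(parse_page(driver, url))
    finally:
        # Always close the last session, even if a page fails.
        driver.quit()
    save_rows(rows, checkpoint_path)
    return rows
```

With Selenium, `make_driver` could be `lambda: webdriver.Chrome(options=options)`, as in the snippet above, and `parse_page` would be whatever function loads the URL and extracts the fields you need into a dict. Passing the driver factory in as an argument keeps the loop independent of the browser, so it can be tried out with a fake driver before pointing it at a real site.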