maria-antoniak / goodreads-scraper

A Python scraper for Goodreads books and reviews.
GNU General Public License v3.0

Feature Additions and Improvements: Goodreads Scraper #43

Open GrimmXoXo opened 6 months ago

GrimmXoXo commented 6 months ago

Description

Hi there, I was working on a book recommendation system project and needed data for my models. I came across your repository and liked the work, so I decided to build on it and use it in my project. If it helps, I would also like to contribute my changes back to Goodreads Scraper.

This is the output for a particular Goodreads category/collection from database_scraper.

This is the output of the JSON files produced by get_books.py:


{
    "book_id": "320",
    "cover_image_uri": "https://images-na.ssl-images-amazon.com/images/S/compressed.photo.goodreads.com/books/1327881361i/320.jpg",
    "book_title": "One Hundred Years of Solitude",
    "top_5_other_editions": [],
    "format": ["417 pages, Mass Market Paperback"],
    "publication_info": ["First published January 1, 1967"],
    "authorlink": "https://www.goodreads.com/author/show/13450.Gabriel_Garc_a_M_rquez",
    "author": "Gabriel García Márquez",
    "num_pages": ["417"],
    "genres": ["Fiction", "Magical Realism", "Literature", "Fantasy", "Novels", "Historical Fiction", "Spanish Literature"],
    "num_ratings": "976523",
    "num_reviews": "46268",
    "average_rating": "4.11",
    "rating_distribution": {
        "5": "480,125",
        "4": "260,614",
        "3": "140,529",
        "2": "57,483",
        "1": "37,772"
    }
}
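For anyone feeding this output into a model, note that every numeric field above is stored as a string, and the rating_distribution counts contain thousands separators. Below is a minimal sketch of one way to normalize such a record, using only the field names and values shown in the sample. The helper names to_int and normalize are hypothetical and not part of this pull request.

```python
import json

# A trimmed copy of the sample record above (only the numeric fields).
raw = json.loads("""
{
    "book_id": "320",
    "num_pages": ["417"],
    "num_ratings": "976523",
    "num_reviews": "46268",
    "average_rating": "4.11",
    "rating_distribution": {
        "5": "480,125",
        "4": "260,614",
        "3": "140,529",
        "2": "57,483",
        "1": "37,772"
    }
}
""")


def to_int(text):
    """Convert a count such as '480,125' to an int (hypothetical helper)."""
    return int(text.replace(",", ""))


def normalize(record):
    """Return a copy of the record with numeric fields as numbers (hypothetical helper)."""
    distribution = {
        int(stars): to_int(count)
        for stars, count in record["rating_distribution"].items()
    }
    pages = record["num_pages"]
    return {
        "book_id": record["book_id"],
        # num_pages is a list in the sample; it may be empty for some books.
        "num_pages": to_int(pages[0]) if pages else None,
        "num_ratings": to_int(record["num_ratings"]),
        "num_reviews": to_int(record["num_reviews"]),
        "average_rating": float(record["average_rating"]),
        "rating_distribution": distribution,
    }


book = normalize(raw)

# Sanity check: recompute the mean rating from the distribution.
total = sum(book["rating_distribution"].values())
weighted = sum(stars * count for stars, count in book["rating_distribution"].items())
recomputed_average = round(weighted / total, 2)
print(total, recomputed_average)
```

For this sample the distribution counts sum to the reported num_ratings, and the recomputed mean agrees with average_rating to two decimals, which makes a cheap consistency check on scraped records.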
maria-antoniak commented 6 months ago

Hi there! This looks wonderful; thank you for all your work! I haven't had time to test yet but will try to do this ASAP. If we integrate your changes, would you like to be credited in the README? If so, how would you like to be credited (username, name, something else)? These are significant changes that we haven't had time to make ourselves, and I want to make sure you get the credit you deserve.

GrimmXoXo commented 6 months ago

Hi! If this goes well, this will be my first contribution ^_^ I would love for my username (GrimmXoXo or GM) to appear in the credits, but please make sure that everything works well first.

GrimmXoXo commented 6 months ago

That would be all, I think. I added a new script to fetch reviews, added a log file for reviews (logging could also be added for book_id and book_details) to help debug problems, and added a README inside the folder that explains how to use the new script, with a working example.
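For readers who have not seen the new script, here is a rough sketch of what file-based logging for a review fetcher can look like with Python's standard logging module. It is illustrative only: the logger name, the reviews.log file name, and the functions make_review_logger and fetch_reviews are assumptions, not the code in this pull request, and the network failure is simulated.

```python
import logging
import os
import tempfile


def make_review_logger(log_path):
    """Create a logger that writes to log_path (hypothetical helper)."""
    logger = logging.getLogger("review_scraper")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    logger.handlers.clear()  # avoid duplicate handlers on repeated calls
    handler = logging.FileHandler(log_path, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger


def fetch_reviews(book_id, logger):
    """Stand-in for the real scraping step; always fails in this sketch."""
    logger.info("fetching reviews for book_id=%s", book_id)
    try:
        # Simulated failure so the example runs without network access.
        raise TimeoutError("simulated network failure")
    except TimeoutError:
        # logger.exception records the message plus the full traceback.
        logger.exception("failed to fetch reviews for book_id=%s", book_id)
        return []


log_path = os.path.join(tempfile.mkdtemp(), "reviews.log")  # assumed file name
logger = make_review_logger(log_path)
reviews = fetch_reviews("320", logger)

with open(log_path, encoding="utf-8") as log_file:
    log_text = log_file.read()
print(log_text)
```

Recording the book_id on every line makes it possible to find which books failed and re-run only those.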