Closed by knox-academy 1 year ago
Mike McConnelly:
Dan Carter:
For issue 1, do we have any specific criteria for selecting the libraries? Should we consider factors such as popularity, ease of use, or compatibility with other tools we are using?
For issue 2, do we have any specific requirements for the format of the JSON data? Should we include all available information from the Hacker News website, or only select fields?
For issue 3, do we have any specific criteria for determining what constitutes duplicate data? Should we compare based on the entire article or only certain fields?
For issue 4, do we have any specific schedule in mind for running the script? Should we consider factors such as server load or peak usage times?
For issue 5, do we have any specific requirements for the S3 bucket? Should we consider factors such as security, accessibility, or cost?
For issue 6, do we have any specific testing criteria in mind? Should we consider factors such as edge cases, error handling, or performance?
For issue 7, do we have any specific documentation standards in place? Should we consider factors such as readability, completeness, or version control?
For issue 8, do we have any specific user guide requirements? Should we consider factors such as audience, language, or format?
Mike McConnelly:
We need a Python script that scrapes Hacker News daily. It should save the data in JSON format, skip duplicates, run on a schedule via GitHub Actions, and save the output to an S3 bucket.
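A rough sketch of the fetch, de-duplicate, and save steps could look like the following. It makes several assumptions that are not settled in this thread: it reads the public Hacker News Firebase API instead of scraping the HTML pages, it treats two stories as duplicates when they share the same `id`, and the function names and the `stories.json` path are made up for illustration. The S3 upload and the GitHub Actions schedule are left out.

```python
import json
import urllib.request
from pathlib import Path

# Public Hacker News API base URL (assumption: we use the API rather than scraping HTML).
HN_API = "https://hacker-news.firebaseio.com/v0"


def fetch_json(url, timeout=10):
    """Download a URL and decode the response body as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def fetch_top_stories(limit=30):
    """Return up to `limit` current top stories as dicts (None for missing items)."""
    ids = fetch_json(f"{HN_API}/topstories.json")[:limit]
    return [fetch_json(f"{HN_API}/item/{story_id}.json") for story_id in ids]


def merge_without_duplicates(existing, fetched):
    """Append fetched stories whose `id` is not already present.

    Assumption: the `id` field alone decides whether a story is a duplicate.
    """
    seen = {story["id"] for story in existing}
    merged = list(existing)
    for story in fetched:
        if not story or "id" not in story:
            continue  # the API returns null for deleted or missing items
        if story["id"] in seen:
            continue
        seen.add(story["id"])
        merged.append(story)
    return merged


def load_stories(path):
    """Load previously saved stories, or an empty list if the file does not exist."""
    path = Path(path)
    if not path.exists():
        return []
    return json.loads(path.read_text(encoding="utf-8"))


def save_stories(path, stories):
    """Write the stories to disk as pretty-printed JSON."""
    Path(path).write_text(json.dumps(stories, indent=2), encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical output file name, for illustration only.
    output = "stories.json"
    save_stories(output, merge_without_duplicates(load_stories(output), fetch_top_stories()))
```

Keeping the merge step separate from the network calls means the duplicate rule can be changed or tested without hitting the API, which may help once the questions above about duplicate criteria are answered.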