Open hannesdatta opened 1 year ago
Personally, I would split the web data collection code into individual sections, such as creating a JSON file, making the request (including the header), and setting a timer.
In each section, I would explain each line of code, making the individual lines easier for others to understand.
Finally, I would combine all parts of the code, showing how they work together to create a complete web scraping script.
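As a rough illustration of how those parts could fit together, here is a minimal sketch. It is not the cheat sheet's actual code: the header value, the pause length, and the function names (`fetch_page`, `collect`, `save_as_json`) are all assumptions made up for this example.

```python
import json
import time

# Hypothetical values, for illustration only.
HEADERS = {"User-Agent": "Mozilla/5.0 (course project; contact: you@example.com)"}
PAUSE_SECONDS = 2


def fetch_page(url):
    """Request one page, sending a header, and return its HTML as text."""
    # Third-party package (pip install requests); imported here so the
    # other functions can be used without it.
    import requests

    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()  # stop early on 4xx/5xx responses
    return response.text


def collect(urls, fetch=fetch_page, pause=PAUSE_SECONDS):
    """Fetch each URL in turn, pausing between requests."""
    records = []
    for i, url in enumerate(urls):
        if i > 0:
            time.sleep(pause)  # timer: wait before every request after the first
        records.append(
            {"url": url, "html": fetch(url), "timestamp": int(time.time())}
        )
    return records


def save_as_json(records, path):
    """Write the records to a file, one JSON object per line."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")
```

Calling `save_as_json(collect(list_of_urls), "pages.json")` would then run all three parts in order. Passing `fetch` as a parameter is just a convenience so the loop can be tried out with a stand-in function before hitting a real website.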
Sounds good. You can work on the issue if you're up to it. Comment here if you accept!
I accept!
So far I have split up the code into 8 sections. These sections are
For each of these sections I have written down, line by line, what the code does, except for section 6, which did not need explanation because it was repetitive. Moreover, for the parts that needed additional explanation I have added a light gray text box. Finally, I have added a mustard-colored text box that explains in 6 steps how to find the tags, attributes, etc. on the website.
I have one major question: would you prefer that I give a brief description of what the code does instead of explaining it line by line? This would free up space, which I could possibly use for the challenges section.
Please let me know what you think is essential but currently left out, and what is currently included but not really necessary to have in the cheat sheet.
I have made two versions. Both are included in the PowerPoint file: Webscraping cheatsheet.pptx
Best, Bas
When scraping, you often need to use various types of code snippets over and over again.
The purpose of this issue is to create a "web scraping cheat sheet".
Desired features:
requests vs. selenium
beautifulsoup and selenium
Deliverable:
.md files) to this file (but I could link it, too)
Bonus point policy for https://odcm.hannesdatta.com: