-
The current scraping implementation is cringe-worthy. Redoing it should result in a much smaller codebase, which will make future maintenance easier.
silum updated
3 years ago
-
Moving toward TOM Toolkit compatibility, the current web scraping scripts should support JSON output as well as writing strings to file.
This will allow the remote broker output to be ingested directly…
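
One way the dual output could look is sketched below. This is only an illustration under assumptions: the function name `write_results`, the `fmt` parameter, and the record fields are hypothetical, and the JSON layout is not the schema TOM Toolkit expects.

```python
import json
from pathlib import Path


def write_results(records, path, fmt="text"):
    """Write scraped records to `path` as JSON or as one string per line.

    `records` is assumed to be a list of flat dicts (hypothetical shape).
    """
    path = Path(path)
    if fmt == "json":
        # A JSON array of objects, so a downstream tool can parse it directly.
        path.write_text(json.dumps(records, indent=2, sort_keys=True), encoding="utf-8")
    elif fmt == "text":
        # One line per record, as space-separated key=value pairs.
        lines = [
            " ".join(f"{key}={value}" for key, value in sorted(rec.items()))
            for rec in records
        ]
        path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    else:
        raise ValueError(f"unsupported format: {fmt!r}")
    return path
```

Keeping the format choice in a single writer function means the scraping code itself does not need to change when a new output format is added.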
-
As mentioned in #87, being able to retrieve from the YouTube operational API data that is already retrievable through YouTube Data API v3 isn't a priority. However, in the future it may be interesting, as an alternative t…
-
I have created a program that scrapes the TimesJobs web page using BeautifulSoup and retrieves Python jobs that were posted a few days ago. It is just a beginner project, so if anyone would like to contrib…
-
This is a structure of courses I propose we should gravitate towards. As of now, it is a rough structure which will get more detailed over time - splitting, merging, renaming, etc. is expected as part…
-
Reviewed Medical Image Datasets:
1. NIH Clinical Center:
Chest X-Ray Dataset (ChestX-ray8): Contains over 100,000 frontal-view X-ray images of 30,805 unique patients with 14 disease labels…
-
### Fake News
- [x] **Boatos**: understand, program, and carry out web scraping of the fake-news verification page [Boatos.org](boatos.org)
- [x] **G1 Fato ou Fake**: understand, program …
-
First pass of the web-scraping component should do the following:
- Callable from external scheduler component (ref Issue #1 - _Create scheduler component_)
- Starts up, does its job, then shuts do…
-
This ticket is based on the [discord discussion](https://discord.com/channels/804053880266686464/963865665096257566/1066387135353081896).
There are some areas where the Playwright implementation co…
-
PR #5 implements the Google Custom Search API, but the stored data does not include:
* AdWords data.
* Total **number of links** on the first page
* The HTML code of the first page
There are sev…