-
Hello,
I would like to add an example showing how to build a distributed web crawler with `Golang` and a `Kafka` cluster on `Kubernetes`. Some ideas:
- A web-extension which collects the …
-
Create a tool that scrapes sample URLs of an online dictionary and isolates the fields related to dictionary entries, such as part of speech, enumerated definitions, example sentences, word name, multiple…
-
# Using Playwright on Heroku - Playwright Community 🎭
Setup and usage of the Playwright Heroku buildpack to run Chromium and Firefox on the Heroku Ubuntu stack.
[https://playwright.tech/blog/running…
-
When URLs (or parts thereof) are converted to strings, they're always escaped:
```python
>>> url = furl('foo.bar/fire truck?hello world=#hi there')
>>> str(url)
'foo.bar/fire%20truck?hello+worl…
```
-
Recognise Aminet download URLs as a specific link type and display the list of mirrors as we do for scene.org.
Note that aminet.net also provides an info page much like scene.org does, e.g. http://am…
-
Today, someone needs to update Config.mk and run `make release` manually. It would be nice to have some glue automation that published new versions of vasmm68k whenever a new version is…
-
This issue is for discussion/documentation of crawling site maps for story URL discovery.
I'm creating it in the rss-fetcher repo, since I think:
1. We will likely want the result to feed into the…
-
Since I'm new to this project, I'm curious why it's preferable to pull the data types for the SyH-DR files by parsing the `SyH-DR Codebook.pdf` file rather than the [`SyH-DR Data Dictionary.xlsx`](htt…
-
Videos are not loading.