-
In Lollypop, the album art can come from either
1) the metadata itself or
2) an image in the folder
All my music is organized with 2) because that way the files are separated from the artwork an…
-
* Name of dataset: brWaC (Brazilian Portuguese Web as Corpus)
* URL of dataset: https://www.inf.ufrgs.br/pln/wiki/index.php?title=BrWaC
* License of dataset: not specified
* Short description of da…
-
The Value element specifies that the Type attribute is required but provides absolutely no guidance on what the data types are. It took more than an hour of searching MSDN/Learn, searching the web, crawlin…
-
As of now, Colly parses URLs with the Go standard library's `net/url` parser. This parser is fairly simple and doesn't replicate some of the quirks that browsers handle. Since Colly is a web crawling framework, in order to be abl…
-
[2014-07-23 13:02:51,591][WARN ][org.apache.tika.mime.MimeTypesReader] Invalid media type configuration entry: application/dita+xml;format=map
org.apache.tika.mime.MimeTypeException: Invalid media typ…
-
nfpm
https://nfpm.goreleaser.com/
https://github.com/goreleaser/nfpm
https://github.com/burningalchemist/action-gh-nfpm
`.nfpm.yaml`
```
name: go-pot
# We'll use a template for arch
…
-
### Context
Prior reading: https://anjackson.net/2023/06/09/what-makes-a-large-website-large/
> The simplest way to deal with this risk of temporal incoherence is to have two crawls. A shallow a…
-
I am learning Node.js and crawling with your great app, but when I set up app.js in Eclipse, it shows an error message like:
Express
500 Error: spawn UNKNOWN
at exports._errnoException (…
-
The [Fragmention specification](https://indieweb.org/fragmention) does not explicitly say anything about hashbang (#!) URLs or similar common (including legacy) single-page-app routing patterns, and i…
-
- sEt1: compile a list of scientists, e.g. by crawling uni websites **(2 steps)**
- sEt2: choose a source of information for publications (e.g. personal websites, Google Scholar, ISI Web of Science)…