-
```
What steps will reproduce the problem?
1.
SLES 11.3 with slightly patched 3.16 kernel
Linux memcached9 3.16.3-4.1.100-default #1 SMP Thu Sep 18 06:32:16 UTC 2014
(d2bbe7f) x86_64 x86_64 x86_64 GN…
-
With browsertrix-crawler, a user can use `combineWARC` to write contextual information defined in the `warcinfo` property into the destination WARC. When the WARC is read, the fields defined in the pr…
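
For illustration only, here is a minimal sketch of how the fields of a `warcinfo` record might be inspected once its payload has been extracted. It assumes the payload uses the WARC format's `application/warc-fields` layout (one `Name: value` pair per line); `parse_warc_fields` and the sample field values are hypothetical and are not part of browsertrix-crawler.

```python
def parse_warc_fields(payload: str) -> dict[str, str]:
    """Parse a warc-fields block (one 'Name: value' pair per line) into a dict."""
    fields: dict[str, str] = {}
    last_key = None
    for line in payload.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # A line starting with whitespace continues the previous field's value.
        if line[0] in " \t" and last_key is not None:
            fields[last_key] += " " + line.strip()
            continue
        # Split on the first colon only, so values such as URLs stay intact.
        name, sep, value = line.partition(":")
        if not sep:
            continue  # not a field line
        last_key = name.strip()
        fields[last_key] = value.strip()
    return fields


# Hypothetical payload; real field names depend on what the crawl config defines.
sample = (
    "software: example-crawler 1.0\r\n"
    "operator: Example Operator\r\n"
    "isPartOf: http://example.com/collection\r\n"
)
info = parse_warc_fields(sample)
```

Comparing a dictionary like `info` against the values supplied in the `warcinfo` property is one way to check which fields actually reached the combined WARC.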
-
This seems to happen randomly.
Today I searched for _alzheimer_ and got results for _ehlers-danlos syndrome_. Last week I ran a chemical structure search and got results for _cadasil_ (look at the lo…
-
```
Following up on the investigation of the tracebacks obtained with the crawler [0], we had
two tracebacks in cdpedia.log:
Traceback (most recent call last):
File "D:\tmp\pyinstaller-2.0\cdpedia\build\pyi.…
-
Design a reusable workflow that handles website crawling and posts to Artsdata Databus. This workflow should use a Ruby action to keep workflows DRY. It should be usable by anyone external to Culture …
-
In Safari 17.5 on macOS Sonoma 14.5 (using Userscripts 4.4.5), the script causes an annoying refresh:
1) On every search, it first loads Google with the “All” tab.
2) It then quickly switches the …
-
```
What steps will reproduce the problem?
1. Create a web-page with a malformed URL (or a protocol like mailto:)
2. Run the crawler on said website.
3. Crash and burn at line 89 in WebURL.java - this…
-
```
What steps will reproduce the problem?
1. Create a web-page with a malformed URL (or a protocol like mailto:)
2. Run the crawler on said website.
3. Crash and burn at line 89 in WebURL.java - this…
-
### ZIM(s) location
https://library.kiwix.org/viewer#athena_fr_all_2024-05
### Recipe(s) URL
https://farm.openzim.org/recipes/athena_fr_all/
### Readers tested
- [ ] Kiwix-serve on iOS (iPad / iP…
-
Details on accessing web content behind paywalls...
http://www.ghacks.net/2016/02/26/read-articles-behind-paywalls-by-masquerading-as-googlebot/
It references two addons, RefControl and User Agent S…