-
If a WARC request record contains an overlong and truncated HTTP request header line (`GET /path HTTP/1.1`), HttpRequestMessageParser throws an exception which causes that the request record is not tr…
-
I'm conducting trials with ProofWriter and stumbled upon an instance where, despite the formalization appearing accurate, it's yielding inaccurate results.
The problem is `ProofWriter_RelNoneg-…
-
While working on updating the docstring in agents.py, I noticed some discrepancies within the code. There were some differences between the algorithms [pdf](https://github.com/aimacode/aima-pseudocode…
-
I've found several plugins that perform this awesome work of making the web just a little less sucky.
It seems like everybody maintains their own set of rules and their own stripping code. I'm look…
-
Plan and execute an ITS "bootcamp" similar to @SMJ's Plan 9 bootcamps: https://sdf.org/plan9/
Collect ideas here.
-
Hi. I'm a search engine guy, and I'm very interested in a well-tested list of strippable CGI args to reduce the work my crawler has to do. I tried to algorithmically build a list by taking the top 1000 …
-
I was attempting to talk my boss through using `py-spy top` on his long-running process ("ok type ps, now note the pid of your python process, then..."), and it struck me that almost all of the time t…
-
The Wumpus code does not display the proper colours for each of the glyphs being used.
-
According to the LSP description of "`workspace/symbol`" requests (reading from the 3.16 spec), passing an empty query string is a request for all symbols in the workspace:
```
/**
* The paramet…
-
Hi,
Thanks for sharing the programming example - https://github.com/cocrawler/cdx_toolkit#programming-example
I wanted to ask if there is a way to feed in a list of URLs and retrieve their object…