-
Run Redis on the instance that runs the web server and let it act as a cache layer above the database.
This would speed up data retrieval, since the database runs on a different instance.
Basically, …
zakaf updated
5 years ago
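
A minimal sketch of the cache-layer idea, assuming a cache-aside pattern (check the cache first, fall back to the database on a miss, then store the result). The `InMemoryCache` class is a hypothetical stand-in so the example runs without a server; its `get`/`set(..., ex=...)` methods are modelled on the redis-py client, which could presumably be passed in instead. The `get_user` and `load_from_db` names and the `user:<id>` key format are assumptions for illustration only.

```python
import json
import time


class InMemoryCache:
    """Hypothetical stand-in for a Redis client: get/set with expiry in seconds."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        value, expires_at = self._data.get(key, (None, 0.0))
        if value is not None and time.monotonic() >= expires_at:
            # Entry has expired: drop it and report a miss.
            del self._data[key]
            return None
        return value

    def set(self, key, value, ex):
        self._data[key] = (value, time.monotonic() + ex)


def get_user(user_id, cache, load_from_db, ttl_seconds=300):
    """Cache-aside read: try the cache, otherwise query the database and cache the row."""
    key = f"user:{user_id}"  # assumed key naming scheme
    cached = cache.get(key)
    if cached is not None:
        return json.loads(cached)
    row = load_from_db(user_id)  # the slow call to the remote database instance
    cache.set(key, json.dumps(row), ex=ttl_seconds)
    return row
```

With this shape, only the first request for a given user pays the round trip to the database instance; later requests within the TTL are served from the cache on the web server's own instance.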
-
You can use several online tools and services to handle HTTP redirects and report the final destination URLs and any HTTP status codes encountered during the redirection process. Here are some tools a…
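
For illustration, the kind of report such tools produce can be approximated with a short script. This is only a sketch using the Python standard library, not how any particular tool works: `head` sends one `HEAD` request without following redirects, and `trace_redirects` (a hypothetical helper name) walks the `Location` headers and records each URL with its status code.

```python
from http.client import HTTPConnection, HTTPSConnection
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = {301, 302, 303, 307, 308}


def head(url):
    """Send one HEAD request without following redirects; return (status, Location)."""
    parts = urlsplit(url)
    conn_cls = HTTPSConnection if parts.scheme == "https" else HTTPConnection
    conn = conn_cls(parts.netloc, timeout=10)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("HEAD", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def trace_redirects(url, fetch=head, max_hops=10):
    """Return the (url, status) pair for every hop, ending at the final destination."""
    hops = []
    for _ in range(max_hops):
        status, location = fetch(url)
        hops.append((url, status))
        if status not in REDIRECT_CODES or not location:
            break
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    return hops
```

The last entry of the returned list is the final destination and its status code; `max_hops` guards against redirect loops.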
-
### Problem
Right now, `/learn` is limited to learning local files. It would be nice if we removed the need for locally downloading knowledge data and instead implemented support for `/learn` to di…
dlqqq updated
10 months ago
-
```
There is a feature implemented in trunk: punksearch dumps crawling statistics
into the PUNKSEARCH_HOME/stats directory.
These are simple CSV files which can be visualised in the web interface with some …
-
Web apps often have a large chunk of their functionality hidden beyond a login screen. It'd be cool if there was a way to fill in these credentials while crawling so that we can follow up on these pag…
-
Is it possible to scale the crawler module and/or search module across multiple computers, all concurrently operating on the same data set? (similar to Elasticsearch, for example). If not, a work-arou…
-
Hi,
When I used any model's method, I received an error: `UnicodeEncodeError: 'UCS-2' codec can't encode character '\U0001f4dc' in position 0: Non-BMP character not supported in Tk` (also in position 154…
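
One possible workaround, sketched here as an assumption rather than a fix confirmed for this project: the error arises when a string containing characters outside the Basic Multilingual Plane (code points above `0xFFFF`, such as `'\U0001f4dc'`) is handed to a Tk build that does not support them, so such characters can be replaced before display. The helper name `replace_non_bmp` is hypothetical.

```python
def replace_non_bmp(text, replacement="\ufffd"):
    """Replace every character above U+FFFF, which older Tk builds cannot render."""
    return "".join(ch if ord(ch) <= 0xFFFF else replacement for ch in text)


# Example: sanitise model output before passing it to a Tk-based widget or console.
safe = replace_non_bmp("\U0001f4dc result")
```

This loses the original character, so it is only suitable where the text is meant for display.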
-
(WEB REPORT BY: turingwept REMOTE: 206.221.180.138:7777)
# Revision
faa86c17b73fd48da44641cc92d10729eb0bd25c
# Description
As a diona nymph it's annoying to scroll between object tab and diona t…
-
The first subtask is defining and expressing the **focused crawling** specification.
The second subtask will be implementing that specification in sparkler.
Currently, we have support for URL based fo…
-
For instance: http://rdfdata.eionet.europa.eu/eurostat/void.rdf#env_rwat_rbd
This URI appears in very many documents that are all the same. This is because, I guess, you crawl the web for every URI t…
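
As a hedged illustration of why such URIs lead to identical documents: the fragment (the part after `#`) is never sent to the server, so every URI that differs only in its fragment retrieves the same file. A crawler could, as one assumed approach, strip the fragment and fetch each document once. The sketch below uses `urllib.parse.urldefrag` from the Python standard library; the second URI's fragment and the `documents_to_fetch` helper are made up for the example.

```python
from urllib.parse import urldefrag


def documents_to_fetch(uris):
    """Collapse URIs that differ only by fragment into one document URL each."""
    seen = []
    for uri in uris:
        document_url, _fragment = urldefrag(uri)
        if document_url not in seen:
            seen.append(document_url)
    return seen


uris = [
    "http://rdfdata.eionet.europa.eu/eurostat/void.rdf#env_rwat_rbd",
    "http://rdfdata.eionet.europa.eu/eurostat/void.rdf#another_dataset",  # hypothetical fragment
]
unique_documents = documents_to_fetch(uris)
```

Here both URIs map to the single document `http://rdfdata.eionet.europa.eu/eurostat/void.rdf`.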