-
When Splash handles many pages, it becomes very slow. I want to run Splash on more ports, such as 8051, 8052, 8053, and so on. How should I do that?
I run Splash in a Docker environment.
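One possible approach, sketched here under assumptions: start several Splash containers and map a different host port to each one. The sketch assumes the `scrapinghub/splash` image with Splash listening on its default container port 8050. The helper name `splash_run_commands` is hypothetical, and the block only builds the `docker run` command lines without executing them.

```python
def splash_run_commands(host_ports, image="scrapinghub/splash", container_port=8050):
    """Build one `docker run` command per host port (hypothetical helper).

    Each container listens on `container_port` internally and is
    published on a different host port.
    """
    return [
        ["docker", "run", "-d", "-p", f"{port}:{container_port}", image]
        for port in host_ports
    ]


commands = splash_run_commands([8051, 8052, 8053])
for cmd in commands:
    # Print the commands so they can be reviewed before being run by hand.
    print(" ".join(cmd))
```

Requests would then need to be spread across the three ports, for example by a load balancer or by the client choosing a port per request.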
-
On this line: https://github.com/TeamHG-Memex/autologin-middleware/blob/64265fac56f51557880be0f7e3cc44f2b30c43c9/autologin_middleware/middleware.py#L78
They should be preserved. Originally I deleted t…
-
ACHE does not return nonzero exit codes. Therefore, Celery currently reports all finished tasks as successful, regardless of their log output. Instead, it should preferably: grep the log or results for …
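A minimal sketch of the log-scanning idea follows. The error markers in `ERROR_PATTERN` are assumptions chosen for illustration, not ACHE's actual log format, and `find_error_lines` and `check_log` are hypothetical helper names. Raising an exception inside a Celery task is what causes the task to be recorded as failed.

```python
import re

# Assumed markers; the real patterns depend on what the crawler actually logs.
ERROR_PATTERN = re.compile(r"\b(ERROR|Exception)\b")


def find_error_lines(log_text, pattern=ERROR_PATTERN):
    """Return the log lines that match the error pattern."""
    return [line for line in log_text.splitlines() if pattern.search(line)]


def check_log(log_text):
    """Raise if the log contains error lines, so the calling task fails."""
    errors = find_error_lines(log_text)
    if errors:
        raise RuntimeError(
            f"{len(errors)} error line(s) found, first: {errors[0]}"
        )


sample_log = "INFO crawl started\nERROR could not fetch seed\nINFO done"
print(find_error_lines(sample_log))
```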
-
Been running into OoB issues when the intermediate result vectors are saved, [specifically here](https://github.com/mryoo/pooled_time_series/blob/master/src/main/java/gov/nasa/jpl/memex/pooledtimeseri…
-
It makes sense to assign more weight to recent samples in DupePredictor; together with https://github.com/TeamHG-Memex/undercrawler/issues/41 it should make it possible to handle the case where the crawler first visits…
kmike updated
8 years ago
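One way to weight recent samples more heavily is exponential decay, sketched below. This is an illustration of the general technique, not DupePredictor's implementation; the function name `decayed_duplicate_rate` and the `decay` parameter are hypothetical.

```python
def decayed_duplicate_rate(samples, decay=0.9):
    """Estimate the duplicate rate with exponentially decayed weights.

    `samples` is an iterable of booleans ordered oldest first, where True
    means the page was a duplicate. Each older sample's weight is
    multiplied by `decay` for every newer sample that follows it.
    """
    weighted_dupes = 0.0
    total_weight = 0.0
    for is_dupe in samples:
        weighted_dupes = weighted_dupes * decay + (1.0 if is_dupe else 0.0)
        total_weight = total_weight * decay + 1.0
    return weighted_dupes / total_weight if total_weight else 0.0


# The same two observations give a higher rate when the duplicate is recent.
print(decayed_duplicate_rate([False, True], decay=0.5))
print(decayed_duplicate_rate([True, False], decay=0.5))
```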
-
It is not obvious that a Docker container exists for image_space.
Would be great if we could make it explicit that there is one and then make it extremely easy for folks to find it and run it locally.
@…
-
We need to be able to export a domain for use by the crawling teams. The domain should be the aggregate of all of the trails within a domain (At some point we may want to be able to choose specific tr…
-
I haven't started the autologin servers and got these exceptions:
```
py35 runtests: commands[4] | py.test --doctest-modules --cov=undercrawler undercrawler tests
========================================…
kmike updated
8 years ago
-
Say we want to add some cookies we got elsewhere to a request: we set `request.cookies`. But the format is HAR, which is not a native Python format. I think it would be convenient to allow setting cooki…
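To illustrate the gap between the two formats, here is a small sketch of converting between a plain Python dict and HAR-style cookies. It assumes only that HAR cookies are a list of objects with `name` and `value` fields; the helper names are hypothetical and are not part of any existing API.

```python
def dict_to_har_cookies(cookies):
    """Convert {"name": "value"} into a HAR-style list of cookie objects."""
    return [
        {"name": str(name), "value": str(value)}
        for name, value in cookies.items()
    ]


def har_cookies_to_dict(har_cookies):
    """Convert a HAR-style cookie list back into a plain dict.

    Optional HAR fields such as path or domain are dropped.
    """
    return {cookie["name"]: cookie["value"] for cookie in har_cookies}


har = dict_to_har_cookies({"sessionid": "abc123", "lang": "en"})
print(har)
print(har_cookies_to_dict(har))
```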
-
Below is a snippet of my Solr index with all the required geographic details.
{"id":"/dev/com/10news/www/42FD5EBC90236148BA72BEC73D622543F282BAF46F05AED2566EE6D7D7BF2A67","Geographic_LONGITUDE":"139.75309…