-
I'm using Imager and integrating with an S3 bucket. Everything works really well when the local cache folder at `public/imager` is populated, but the files in there are being removed unexpectedly caus…
-
Hello, I found something strange in the dataset: the maximum age is 2014 and the minimum age is -31.
I tried using your code in
https://github.com/yu4u/age-gender-estimation/blob/master/create_db.py in line 34 an…
-
Needs investigation
Some leads:
- Kubernetes?
- double check if something is holding up the async calls in the crawler
- more servers?
- different way of submitting batches/multithreading
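
One way to explore the last two leads is a minimal sketch, assuming the crawler is (or could be) driven by Python's `asyncio`. It caps the number of in-flight calls with a semaphore and wraps each call in a timeout, so a call that hangs shows up as `None` instead of silently stalling the batch. `fetch`, `crawl_batch`, and `run_batch` are hypothetical names for illustration, not the crawler's actual functions.

```python
import asyncio


async def fetch(url: str) -> str:
    # Hypothetical stand-in for the crawler's real async call.
    await asyncio.sleep(0.01)
    return f"ok:{url}"


async def crawl_batch(urls, max_concurrency=5, timeout=1.0):
    # Limit how many calls run at once so one slow batch cannot starve the rest.
    sem = asyncio.Semaphore(max_concurrency)

    async def guarded(url):
        async with sem:
            try:
                # A call that exceeds `timeout` is cancelled and reported as None.
                return url, await asyncio.wait_for(fetch(url), timeout)
            except asyncio.TimeoutError:
                return url, None

    # gather() returns results in the same order as the input URLs.
    return await asyncio.gather(*(guarded(u) for u in urls))


def run_batch(urls, **kwargs):
    # Synchronous entry point for scripts and tests.
    return asyncio.run(crawl_batch(urls, **kwargs))
```

Any URL that comes back paired with `None` is a candidate for whatever is holding up the async calls; lowering `max_concurrency` or `timeout` should make the stall easier to reproduce.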
-
Today I noticed that the JSON-LD example for MedicalCause shows it can be embedded within MedicalCondition using the "cause" property:
![image](https://user-images.githubusercontent.com/58438178/1186…
-
This is half a note to myself, about getting this codebase, or at least the CLI portion, working for at least one non-English Wikipedia (Hebrew). Currently the CLI does not crash, but extracts zero co…
-
```
foo.hpp:
class foo
{
};
---
foo_fwd.hpp:
class foo;
---
baz.hpp:
#include "foo.hpp"
// do_something_with_foo
---
main1.cpp:
#include "foo_fwd.hpp" // works
void x(foo&);…
-
We are currently running feed scraping code that is a few years old. Feed scraping for us just means discovering the set of RSS feeds that cover all syndicated stories for a given site URL. So feed …
-
Being comprehensively indexed by search engines such as Google is a substantial benefit for DataONE and DataONE Members. Ideally, the whole variety of information (datasets, people, portals, metrics, …
-
Hey guys, I've been checking out Pagefind and it works pretty great! I integrated it with my tiny website and it works pretty much flawlessly. The installation was super easy and the UI is slick, fast…
-
Hello,
I use the latest version of simplecrawler.
When crawling a site, my program stops and displays this error:
```
_http_outgoing.js:492
throw new TypeError('The header content contains inv…