Open singhpratyush opened 7 years ago
Or maybe something like this -
import { LoklakHarvester } from 'loklak_scrapers_js';

export default class MyLoklakHarvester extends LoklakHarvester {
  constructor () {
    super({
      backend: 'http://loklak.org',
      workers: 4,
      ...
    });
  }

  onHarvestStart = (backend, query) => {
    ...
  }

  onHarvestComplete = (backend, query, messages) => {
    ...
  }

  onHarvestError = (backend, query, error) => {
    ...
  }

  ...
}

...

let harvester = new MyLoklakHarvester();
harvester.start();
...
harvester.stop();
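To make the proposed lifecycle hooks concrete, here is a minimal sketch of what a base class behind such an API might look like. Everything in it is an assumption made for illustration: the class name `HarvesterSketch`, the injected `getQuery` and `scrape` functions, and the loop in `start()` are hypothetical and are not part of loklak_scrapers_js. The `workers` option is left out to keep the sketch short.

```javascript
// Hypothetical base class sketching the lifecycle hooks; NOT the real
// loklak_scrapers_js code. `getQuery` and `scrape` are assumed, injected
// async functions standing in for the backend call and the scraper.
class HarvesterSketch {
  constructor({ backend, getQuery, scrape }) {
    this.backend = backend;
    this.getQuery = getQuery;
    this.scrape = scrape;
    this.running = false;
  }

  // Default hooks do nothing; subclasses override the ones they need.
  onHarvestStart(backend, query) {}
  onHarvestComplete(backend, query, messages) {}
  onHarvestError(backend, query, error) {}

  // One cycle: get a query, scrape it, report the outcome through the hooks.
  async harvestOnce() {
    const query = await this.getQuery(this.backend);
    this.onHarvestStart(this.backend, query);
    try {
      const messages = await this.scrape(query);
      this.onHarvestComplete(this.backend, query, messages);
    } catch (error) {
      this.onHarvestError(this.backend, query, error);
    }
  }

  // Keep harvesting until stop() is called; resolves when the loop ends.
  async start() {
    this.running = true;
    while (this.running) {
      await this.harvestOnce();
    }
  }

  stop() {
    this.running = false;
  }
}
```

A subclass would override only the hooks it cares about and call `stop()` when it wants the loop to end. In this sketch a failure inside `scrape` is routed to `onHarvestError`, while a failure inside `getQuery` rejects the promise returned by `start()`.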
@Achint08 @djmgit @hemantjadon @kavithaenair @SKrPl @vibhcool: What do you think about this? Is this approach reasonable to run loklak_scrapers_js wherever a web page is open?
@singhpratyush, I have a question: are we discussing creating a RESTful API, an HTTP API, or a JavaScript library for scrapers?
On the whole, you can think of it as an npm package that would allow scraping and submitting data to loklak.
@singhpratyush I agree with this approach. We can make loklak_scrapers_js totally configurable and more usable this way 👍
OK, got it, but there is an issue: the multithreading would have to be handled by whichever loklak project uses this, and Node.js does not perform well at multithreading.
What JavaScript does best is scraping and dealing with the JavaScript running on web pages.
I haven't mentioned anything about multithreading here. I guess you got the idea from workers.
This is just a rough idea that I came up with, and it needs to be discussed before proceeding. workers could be the number of simultaneous requests that are made to the services (Twitter, Github, etc.).
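If `workers` is read as a cap on simultaneous requests, no threads are needed: a small promise pool on a single event loop is enough. The sketch below is one possible interpretation only, and `runWithLimit` is a hypothetical helper name, not something loklak_scrapers_js provides.

```javascript
// Hypothetical helper: run async tasks with at most `workers` in flight.
// `tasks` is an array of functions that each return a Promise.
async function runWithLimit(tasks, workers) {
  const results = new Array(tasks.length);
  let next = 0; // index of the next task that has not been started yet

  // Each runner pulls tasks off the shared queue until none remain.
  async function runner() {
    while (next < tasks.length) {
      const index = next++;
      results[index] = await tasks[index]();
    }
  }

  // Never start more runners than there are tasks, and at least one.
  const runnerCount = Math.max(1, Math.min(workers, tasks.length));
  await Promise.all(Array.from({ length: runnerCount }, runner));
  return results; // same order as `tasks`
}
```

With `workers: 4`, at most four requests to the scraped services would be pending at any moment. If any task rejects, the returned promise rejects as well, so a real harvester would probably catch errors per task instead.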
Issue Description
Issue type: Parent issue
As of now, this JS has to be bundled so that it can be used in other projects and even then, the functions have to be manually imported.
It would be good to have an API of the following type or similar -
This would facilitate usage of loklak_scrapers_js in many projects and also allow an easy, plug-and-play interface for any website.