Hi there! First off, very nice tool you have built.
I have noticed some things while trying to integrate your repo which may be of use to you.
I didn't check any code to see what was causing it but merely noticed some behavior.
I am using the module, not the argument parser to execute queries.
1- Proxies and timeout
For some reason the timeout option seems to have no effect: the program stalls well past the configured timeout.
I am using free proxies scavenged from the internet so it is important that this works.
Even when you're using private proxies you would still expect this configuration to work.
2- .products() amount of results returned
I am using the following options:
argv['keyword'] = keyword;
argv['category'] = 'aps';
argv['country'] = 'US';
argv['number'] = conf.scrape_products_per_keyword; // set to 500
argv['bulk'] = true;
argv['proxy'] = proxies_formatted; // a list of proxies
argv['rating'] = [conf.product_min_rating, conf.product_max_rating]; // [3,5]
argv['sort'] = true;
argv['randomUa'] = true;
argv['timeout'] = conf.request_timeout_ms; // 1000
As you can see, this has bulk=true. However, it consistently returns results from just one page (<50).
I worked around this by turning bulk off and manually iterating over the pages. Perhaps this is due to the poor quality of the proxies, but it shouldn't happen nonetheless.
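For reference, the manual iteration looks roughly like the sketch below. It is not taken from the library: `fetchPage` is a hypothetical function standing in for a single-page call with bulk off, and `collectProducts` and `maxPages` are names made up for illustration.

```javascript
// Collect products page by page until `wanted` items are gathered.
// `fetchPage(page)` is a HYPOTHETICAL async function returning an array of
// products for one page (e.g. a wrapper around the library call with bulk off).
async function collectProducts(fetchPage, wanted, maxPages = 20) {
  const results = [];
  for (let page = 1; page <= maxPages && results.length < wanted; page++) {
    const items = await fetchPage(page);
    // Stop early when a page comes back empty: there is nothing more to fetch.
    if (!items || items.length === 0) break;
    results.push(...items);
  }
  // Trim any overshoot from the last page.
  return results.slice(0, wanted);
}
```

The `maxPages` cap is only there so that a misbehaving proxy cannot make the loop run forever.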
3- request-promise
The request-promise package has been deprecated.
For the timeout problem in point 1, perhaps a Promise.race-based timeout could be considered: https://medium.com/javascript-in-plain-english/use-promise-race-to-timeout-promises-6710cb0a3164
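A minimal sketch of the Promise.race idea, in case it is useful. `withTimeout` is a made-up helper name, not something from this repo, and I am assuming the scraping call returns a promise. Note that this only stops the caller from waiting; it does not cancel the underlying request.

```javascript
// Reject if `promise` has not settled within `ms` milliseconds.
// `withTimeout` is an illustrative helper, not part of the library.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins. Always clear the timer afterwards so it
  // cannot keep the process alive once the real promise has finished.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical usage, where `scrape` stands in for the call that returns results:
// withTimeout(scrape(argv), conf.request_timeout_ms)
//   .catch((err) => console.error(err.message));
```

With unreliable proxies, a wrapper like this would at least let the caller give up on a stalled request and move on to the next proxy.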