apify / crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
https://crawlee.dev
Apache License 2.0

feature: keyvaluestores.listKeys() #277

Closed jonathanstanley closed 5 years ago

jonathanstanley commented 5 years ago

It seems there is no way to list keys using the SDK, so we cannot get the keys of a local datastore through the SDK. That pushes me and others to either hack into the class or duplicate the SDK's logic.

It just needs two functions:

listKeys() using apify-client for KeyValueStore class (key_value_store.js ~L156)

listKeys() emulator for KeyValueStoreLocal class (key_value_store.js ~L311) probably something like:

listKeys() {
  // Read the file names in the local store directory and wrap them in an `items` array.
  return readdirPromised(this.localStoragePath)
    .then((files) => {
      return { items: files };
    });
}
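For illustration, here is a standalone sketch of the same idea using only Node.js built-ins. It is not the SDK's code: the free function `listKeys(localStoragePath)` and its `{ items: [{ key }] }` return shape are hypothetical, and it assumes each record is stored as a single file named `<key>.<extension>` directly inside the store directory.

```javascript
const fs = require('fs');
const path = require('path');

// Hypothetical helper, not part of the SDK: lists the keys of a local
// key-value store directory.
// Assumption: every record is one file named `<key>.<extension>`.
async function listKeys(localStoragePath) {
  // withFileTypes lets us skip subdirectories without extra stat() calls.
  const entries = await fs.promises.readdir(localStoragePath, { withFileTypes: true });

  const keys = entries
    .filter((entry) => entry.isFile())
    // Drop the extension so 'INPUT.json' becomes the key 'INPUT'.
    .map((entry) => path.parse(entry.name).name)
    // readdir order is not guaranteed, so sort for a stable result.
    .sort();

  return { items: keys.map((key) => ({ key })) };
}
```

Calling `listKeys('./some_store_dir')` (a placeholder path) resolves to an object whose `items` array holds one `{ key }` entry per file. Stripping the extension is a design choice made here so that the result contains keys rather than file names; the snippet in the original proposal returns the raw file names instead.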
mnmkng commented 5 years ago

Thank you for the suggestion, @jonathanstanley. There's definitely room for improvement, but since, as you correctly point out, the workaround is quite simple, we can't promise any fast development in this matter.

A PR is always welcome though. Would you be willing to work on that?

jancurn commented 5 years ago

Related to https://github.com/apifytech/apify-js/issues/249

mnmkng commented 5 years ago

Implemented by keyValueStore.forEach()