Open jdpedrie opened 2 days ago
Thanks @jdpedrie, definitely worth considering. Would you be interested in submitting a PR? This would make it easier to compare with what we currently have. BTW, do you use URLFrontier? It would be great to hear about it if possible.
Ah, that's a good idea. Did that in #111.
I'm currently evaluating URLFrontier for use in a new project. We've used StormCrawler in production for a number of years, but it's still a version from before this project was implemented.
Is the reference implementation suitable for production use?
> I'm currently evaluating URLFrontier for use in a new project. We've used StormCrawler in production for a number of years, but it's still a version from before this project was implemented.

Great to hear you use StormCrawler! (I'm even more curious about what you use it for, at what scale, etc.) What backend do you currently have with it?
> Is the reference implementation suitable for production use?

I haven't used it in production. I know that OpenWebSearch uses URLFrontier, but with a different implementation. @klockla uses the RocksDB one. Not sure about @zaibacu.
We use it to power our news search feature in Freespoke. It's backed by Elasticsearch.
Hello,
I'd like to propose that in the next major version of this project, the API definition be modified to follow conventions for protocol buffers established in AIP and protolint.
Some of the changes I made:
- … `google.protobuf.Empty`.
- … `UNSPECIFIED`. (https://google.aip.dev/126)
- … `google.protobuf.Timestamp`.

Thanks for providing this API and the reference implementation!
A sample:
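
As a rough illustration of the three conventions mentioned above (and not the actual diff from the PR), a proto3 fragment might look like the sketch below. The package, service, message, enum, and field names are all hypothetical, and the comments about what each convention replaces are assumptions rather than a description of the current URLFrontier definition.

```proto
syntax = "proto3";

// Hypothetical package name, for illustration only.
package example.frontier.v1;

import "google/protobuf/empty.proto";
import "google/protobuf/timestamp.proto";

// Hypothetical service; not the actual URLFrontier API.
service ExampleFrontier {
  // Returns the well-known Empty type (assumed here to stand in for a
  // locally defined empty message).
  rpc DeleteQueue(DeleteQueueRequest) returns (google.protobuf.Empty);
}

message DeleteQueueRequest {
  // Hypothetical identifier of the queue to delete.
  string key = 1;
}

message UrlRecord {
  // AIP-126: the first enum value is 0 and is named after the enum
  // with an _UNSPECIFIED suffix.
  enum Status {
    STATUS_UNSPECIFIED = 0;
    DISCOVERED = 1;
    FETCHED = 2;
  }

  string url = 1;
  Status status = 2;

  // Well-known Timestamp type (assumed here to stand in for a raw
  // integer holding epoch seconds).
  google.protobuf.Timestamp refetchable_from = 3;
}
```

One consequence of the `_UNSPECIFIED` zero value is that a field left unset by an older client can be told apart from a deliberately chosen status, since proto3 does not track presence for plain enum fields.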