google / corpuscrawler

Crawler for linguistic corpora

Adding New URLs #74

Closed Mounika2405 closed 3 years ago

Mounika2405 commented 4 years ago

Hi,

Can we fetch data from URLs that the existing code does not cover by adding custom functions? Also, is English not supported? 'en' does not appear anywhere in the list of supported languages.

Thanks

brawer commented 4 years ago

I’ve left Google, so I can’t really comment on their policies, but it’s an open-source project, so of course you can add custom functions and send a pull request to get your changes integrated into the project. English is currently unsupported because I didn’t need an English corpus when I started this project (while still at Google); again, contributions would be very welcome. Best, Sascha

sffc commented 4 years ago

Yes, please contribute PRs with new URLs that would be useful to the audience of corpuscrawler.
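
For anyone wondering what a custom function for a new URL might look like, here is a minimal, standalone sketch using only the Python standard library. It does not use corpuscrawler's own helpers: `ParagraphExtractor`, `fetch_url`, `crawl_custom_site`, and the `# Location:` header line are all hypothetical names and conventions chosen for illustration, so check the existing per-language modules in the repository for the conventions a pull request would actually need to follow.

```python
import urllib.request
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collects the whitespace-normalized text of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_p:
            # Collapse runs of whitespace and drop empty paragraphs.
            text = " ".join("".join(self._buffer).split())
            if text:
                self.paragraphs.append(text)
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self._buffer.append(data)


def fetch_url(url, timeout=30):
    """Hypothetical helper: downloads a page and decodes it to text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def crawl_custom_site(urls, fetch, out):
    """Writes the paragraphs of each page to `out`, one per line.

    `fetch` is any callable mapping a URL to HTML text, which keeps the
    function testable without network access.
    """
    for url in urls:
        parser = ParagraphExtractor()
        parser.feed(fetch(url))
        parser.close()
        out.write("# Location: %s\n" % url)
        for paragraph in parser.paragraphs:
            out.write(paragraph + "\n")
```

To try it against a real site, pass `fetch_url` as the `fetch` argument and an open text file as `out`; before crawling, check that the site's terms and robots.txt allow it.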