tosdr / crawler.tosdr.org

ToS;DR Crawlers

Merge with tosback-crawler #2

Open michielbdejong opened 3 years ago

michielbdejong commented 3 years ago

Hi ToS;DR crawler, meet our Tosback crawler :) https://github.com/tosdr/tosback-crawler/issues/8

michielbdejong commented 3 years ago

@JustinBack hi! I think the last time we discussed this, we noted the two crawlers solve different parts of the crawling problem. Can we merge them together somehow?

Currently, we are using edit.tosdr.org in front of crawler.tosdr.org and that works well, but there are a number of advantages we can get from tosback-crawler:

JustinBack commented 3 years ago

It should be possible to merge them together; I'd just have to get into the code tosback-crawler uses.

Technically it's possible to switch crawler.tosdr.org over to Markdown as well, maybe for testing? I can migrate a specific crawler to Markdown to see how it goes, then merge it into tosback.