datatogether / archivertools

Python package for scraping websites into the Data Together pipeline via morph.io
GNU Affero General Public License v3.0

Finalize and test customcrawls data together endpoint #17

Open ebenp opened 6 years ago

ebenp commented 6 years ago

Associated with https://github.com/datatogether/roadmap/issues/63

jeffreyliu commented 6 years ago

TODO: Create a test scraper on morph.io that saves test files of several different types into the morph db tables, then verify that they are replicated correctly on the DT side.

Test files (feel free to add suggestions): an image, a text file, a large binary, duplicate files, a compressed file
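A rough sketch of how the test fixtures and the replication check could look, using only the standard library. This does not talk to morph.io or to the DT side at all: it just builds one payload per file type suggested above and compares SHA-256 digests of what was saved against what came back. All function names here (`build_test_payloads`, `find_mismatches`, and so on) are hypothetical and not part of archivertools, and comparing by SHA-256 is an assumption about how "replicated correctly" would be judged.

```python
import gzip
import hashlib
import os
import struct
import zlib


def _png_chunk(kind, data):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(kind + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + kind + data + struct.pack(">I", crc)


def tiny_png():
    """Return the bytes of a 1x1 grayscale PNG (stands in for 'an image')."""
    header = struct.pack(">IIBBBBB", 1, 1, 8, 0, 0, 0, 0)
    pixels = zlib.compress(b"\x00\x00")  # filter byte + one black pixel
    return (b"\x89PNG\r\n\x1a\n"
            + _png_chunk(b"IHDR", header)
            + _png_chunk(b"IDAT", pixels)
            + _png_chunk(b"IEND", b""))


def build_test_payloads(large_size=1024 * 1024):
    """Return {name: bytes} with one entry per suggested test file type."""
    text = b"hello from the test scraper\n"
    return {
        "sample.txt": text,
        "duplicate.txt": text,                 # same bytes, different name
        "large.bin": os.urandom(large_size),   # size is an arbitrary choice
        "sample.txt.gz": gzip.compress(text),
        "pixel.png": tiny_png(),
    }


def sha256_hex(data):
    """Hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


def find_mismatches(saved, replicated):
    """Names in `saved` that are missing from, or differ in, `replicated`."""
    return sorted(
        name for name, data in saved.items()
        if name not in replicated
        or sha256_hex(replicated[name]) != sha256_hex(data)
    )
```

In a real test, `saved` would be whatever the test scraper wrote to morph and `replicated` would be whatever is read back from the DT side; an empty list from `find_mismatches` would mean every file came through byte-for-byte.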

b5 commented 6 years ago

yo yo! So quick update on this, we now have an operational endpoint at api.archivers.co/customcrawls ready for testing!