htrc / htrc-feature-reader

Tools for working with HTRC Feature Extraction files

Add argument for EF endpoint #23

Closed JaimieMurdock closed 4 years ago

JaimieMurdock commented 6 years ago

If a user has a complete local mirror, they should be able to point to it instead of the "data.analytics.hathitrust.org" URL used by download_file and the FeatureReader DL_URL.

This could also be used to check a local cache first, in case the JSON files have already been downloaded.
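
A rough sketch of what I have in mind, not actual library code. The function name locate_volume, its parameters, the "https://" scheme, and the idea that the relative path is simply appended to the endpoint are all assumptions made for illustration. Only the host name comes from the current default.

```python
from pathlib import Path
from typing import Optional

# Host taken from this issue; the scheme and the absence of any path prefix are assumed.
DEFAULT_ENDPOINT = "https://data.analytics.hathitrust.org"


def locate_volume(rel_path: str,
                  endpoint: str = DEFAULT_ENDPOINT,
                  cache_dir: Optional[str] = None) -> str:
    """Return a local file path if the volume is cached, otherwise a URL.

    Hypothetical helper: `rel_path` is the volume's path relative to the
    root of the collection, however the caller chooses to compute it.
    """
    if cache_dir is not None:
        local = Path(cache_dir) / rel_path
        if local.is_file():
            # Already downloaded (or mirrored) locally, so skip the network.
            return str(local)
    # Fall back to the configured endpoint, which may be a remote mirror.
    return endpoint.rstrip("/") + "/" + rel_path.lstrip("/")
```

With something like this, a full local mirror is just a cache_dir in which every lookup succeeds, and a different remote mirror is just a different endpoint value.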

organisciak commented 6 years ago

Why not just load the regular way, giving FeatureReader a file location or iterable of file locations?

Is your request (1) to combine the Hathi ID -> file path mapping with local paths, so that you can use just the IDs on your local system (provided you have the exact same pairtree structure), or (2) to allow changing the URL to point at other remote locations?

I suppose there's possible merit to both readings...
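
For reading (1), the ID -> path step could look roughly like the sketch below. It assumes the usual pairtree convention (":" becomes "+", "/" becomes "=", "." becomes ",", then the cleaned ID is split into two-character directories) and a ".json.bz2" file suffix. The function names are hypothetical, the hex-escaping pass of the full pairtree spec is left out, and the real file naming may differ between dataset versions, so treat it as an illustration rather than the library's behavior.

```python
def clean_volume_id(vol_id: str) -> str:
    """Apply the single-character pairtree substitutions (simplified)."""
    return vol_id.translate(str.maketrans({":": "+", "/": "=", ".": ","}))


def htid_to_relpath(htid: str, suffix: str = ".json.bz2") -> str:
    """Map an ID such as 'lib.volume' to an assumed pairtree-relative path."""
    # The library prefix ends at the first dot; the rest is the volume ID.
    lib, vol = htid.split(".", 1)
    clean = clean_volume_id(vol)
    # Pairtree directories are consecutive two-character chunks of the clean ID.
    chunks = [clean[i:i + 2] for i in range(0, len(clean), 2)]
    filename = f"{lib}.{clean}{suffix}"
    return "/".join([lib, "pairtree_root", *chunks, clean, filename])


# Made-up ID, used only to show the shape of the result.
print(htid_to_relpath("mdp.39015012345678"))
```

Joining that relative path onto a local root would cover (1), and joining it onto a configurable base URL would cover (2), which is partly why both readings seem plausible.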