organisciak closed 7 years ago
Code done and new tests passing: https://github.com/htrc/htrc-feature-reader/tree/online_read
Rather than using the Rsync subprocess, I implemented it around an HTTP download endpoint. The one blocker is that HTRC doesn't yet officially support the web downloader, so the URL is temporary. @borice, let me know when we have a permanent one.
This has been done. The mashup and volume checker from David are here: http://data.analytics.hathitrust.org/htrc-mashup/VolumeCheck
When the Rsync subprocess is done (#9), it would be nice to initialize volumes that haven't been downloaded yet.
For example:
In the generator, every time a volume is requested, the file for that ID can be downloaded to a temporary location, read into memory, and deleted. If HTRC implements an HTTP download, that would be better, as the download can go straight into memory.
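A rough sketch of the temp-file approach described above, not the library's actual code. It assumes volume files are bz2-compressed JSON; `fetch_to_path`, `parse_volume`, `read_via_tempfile`, and `volumes` are hypothetical names, and `fetch_to_path` stands in for whatever ends up doing the transfer (e.g. the Rsync subprocess).

```python
import bz2
import json
import os
import tempfile


def parse_volume(raw_bytes):
    """Decompress a bz2-compressed JSON payload and parse it, all in memory."""
    return json.loads(bz2.decompress(raw_bytes).decode("utf-8"))


def read_via_tempfile(vol_id, fetch_to_path):
    """Download one volume to a temporary file, read it, then delete the file.

    fetch_to_path(vol_id, path) is a hypothetical callable standing in for
    the Rsync subprocess; it must write the volume's file to `path`.
    """
    fd, path = tempfile.mkstemp(suffix=".json.bz2")
    os.close(fd)  # the fetcher reopens the path itself
    try:
        fetch_to_path(vol_id, path)
        with open(path, "rb") as f:
            return parse_volume(f.read())
    finally:
        os.remove(path)  # clean up even if the fetch or parse fails


def volumes(vol_ids, fetch_to_path):
    """Lazily yield parsed volumes, fetching each one only when requested."""
    for vol_id in vol_ids:
        yield read_via_tempfile(vol_id, fetch_to_path)
```

With an HTTP download the temp file would not be needed: the response body (for instance the bytes returned by `urllib.request.urlopen(url).read()`, for whatever URL HTRC ends up providing) could be passed straight to `parse_volume`.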