Closed: hmans closed this issue 9 years ago
As is: fetching is done from within a `Document` instance.

To be: Move the actual fetching outside of `Document`. Get a hash of data, then copy that into a new or existing `Document` instance.

Tasks:

- `DocumentFetching` into `Fetch` and make it return hash data
- `Document.from_url` to first invoke `Fetch`, then decide how to act on it (deduplication etc.)
- `Fetch` for remote hosts
- `Document#fetch` using `Fetch`
- … `consume_json`)
- … `#find_original` et al)
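
The "to be" shape described above could look roughly like the sketch below. It is not the project's actual code. The `title` and `body` fields, the injectable `source:` callable, the in-memory `REGISTRY`, and the `assign` helper are all assumptions made so the example runs without a network or database.

```ruby
require "json"

# Hypothetical stand-in for Fetch: it only turns a URL into a plain hash
# and knows nothing about Document.
module Fetch
  # `source` is an assumed injectable callable (url -> JSON string), used
  # here so the sketch runs without network access.
  def self.call(url, source:)
    data = JSON.parse(source.call(url))
    { "url" => url, "title" => data["title"], "body" => data["body"] }
  end
end

class Document
  attr_accessor :url, :title, :body

  # Assumed in-memory registry standing in for persistent storage, so that
  # deduplication by URL can be shown.
  REGISTRY = {}

  # First invoke Fetch, then decide how to act on the result: reuse an
  # existing Document for the same URL, or create a new one.
  def self.from_url(url, source:)
    attrs = Fetch.call(url, source: source)
    doc = REGISTRY[attrs["url"]] ||= new
    doc.assign(attrs)
  end

  # Refresh an existing instance through the same Fetch path.
  def fetch(source:)
    assign(Fetch.call(url, source: source))
  end

  # Copy hash data into this instance and return it.
  def assign(attrs)
    self.url = attrs["url"]
    self.title = attrs["title"]
    self.body = attrs["body"]
    self
  end
end
```

Because `Fetch` returns plain data, `Document.from_url` and `Document#fetch` share one code path, and the deduplication decision stays inside `Document` instead of being mixed into the fetching logic.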