Do some research to determine which delta compression library would be best for our first implementation. Take some notes so the decision can be made after a team meeting / online discussion.
Key points to look for: How does it use deduping/delta encoding? Do Python libraries/bindings exist for it? Can archives be accessed easily outside of Scrapy? Is it available as an easily installable Linux package?
Also, can you come up with a rough sketch of what an implementation would look like? (e.g., we import the library and simply make a function call, or we would need to write a wrapper for it, etc.)
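To make the "import the library and make a function call" option concrete, here is a minimal sketch of the kind of thin wrapper we might end up with. It is only a stand-in: it uses the standard library's `zlib` with a preset dictionary (`zdict`) to approximate delta encoding against a previous version of a page, and it is not one of the candidate libraries. The names `encode_delta` and `decode_delta` are hypothetical. A real delta library would likely replace the bodies of these two functions while keeping a similar interface.

```python
import zlib


def encode_delta(base: bytes, target: bytes) -> bytes:
    """Compress `target` using `base` as a preset dictionary.

    Hypothetical wrapper; zlib here is a stand-in for a real delta library.
    Content shared with `base` is encoded as back-references, so the
    output is small when the two versions are similar.
    """
    comp = zlib.compressobj(level=9, zdict=base)
    return comp.compress(target) + comp.flush()


def decode_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild the target from `base` and the output of encode_delta."""
    decomp = zlib.decompressobj(zdict=base)
    return decomp.decompress(delta) + decomp.flush()


if __name__ == "__main__":
    # Two versions of a page that differ in a single item.
    old_page = b"<html><body>" + b"".join(
        b"<p>item %d</p>" % i for i in range(200)
    ) + b"</body></html>"
    new_page = old_page.replace(b"item 7<", b"item seven<")

    delta = encode_delta(old_page, new_page)
    assert decode_delta(old_page, delta) == new_page
    print(len(new_page), len(zlib.compress(new_page, 9)), len(delta))
```

One limitation of this stand-in is that DEFLATE only looks back 32 KB, so it stops helping on large pages. That is part of why a dedicated delta compression library is worth evaluating.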