Closed yusefnapora closed 8 years ago
I think we want to store the hash of the original asset, at its original size, and have the client skip resizing altogether.
If the indexer or any other user of the client has size limitations, it should enforce them on its own in a preprocessing step. They don't belong in the permanent data in the mediachain, and besides, different index implementations may have different size limitations!
Long term:
Agreeing that the Client Writer -> Blockchain Core -> Client Reader parts of the pipeline should probably pass metadata & media through completely unmodified. There should be a clear stage in the pipeline at which the metadata / media data is signed by the writing user and can no longer be modified in place, right?
Factors driving us to impose media resizing / encoding restrictions somewhere in the pipeline (though not necessarily here):
BTW: Yup, the Indexer has independent steps for media resizing and re-encoding.
Have collected up some longer notes and ideas that we can hopefully all discuss in a meeting early this week.
@autoencoder I think there are potentially 2 relevant high-level questions:
@yusefnapora should we pull this in?
I actually don't know... If you run with `--skip-image-downloads`, then this PR won't apply at all. If you are downloading images, the client will also try to upload them to ipfs (or dynamo, if `--diable-ipfs` is used). Putting large original images to dynamo will hit the max size limit, and putting huge images to ipfs may not be what we want either.
That said, it would help the indexer out a lot to have the sha hashes in the records, so it can skip re-downloading. But if we're going with the "radical separation" of indexer from blockchain, where the indexer does its own normalization, etc, then I'm not sure how necessary this is.
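For the "skip re-downloading" case, here is a rough sketch of how an indexer could use the recorded hash. It assumes the record carries the `hash_sha256` field from this PR; the `needs_download` and `store` helpers and the dict-based cache are hypothetical, just for illustration, not how the indexer actually works.

```python
import hashlib


def store(cache, content):
    """Save content in the cache under its hex-encoded SHA-256 digest."""
    digest = hashlib.sha256(content).hexdigest()
    cache[digest] = content
    return digest


def needs_download(asset, cache):
    """Return True unless the asset's recorded hash is already cached.

    `asset` is assumed to be a dict that may carry a `hash_sha256` key.
    Records written without the field always need a download.
    """
    recorded = asset.get("hash_sha256")
    return recorded is None or recorded not in cache


cache = {}  # hypothetical in-memory cache: digest -> bytes
digest = store(cache, b"abc")

already_have = {"hash_sha256": digest}
no_hash_recorded = {}
```

This only works if the hash in the record was computed over the same bytes the indexer would otherwise fetch, which is why the resizing question matters.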
Hmm yeah that's a fair point. This is a small, nondestructive changeset though, so I wonder if we can merge it just to have?
yeah, I don't think it'll hurt anything to merge it in 👍
this adds a `hash_sha256` field to asset dictionaries that has the hex-encoded sha256 hash of the content. I didn't call it `image_hash_sha256` because we'll eventually want to support other asset types.

Now that I'm about to open this PR though, I realized that we're resizing the `thumbnail` image in the writer client to 1024px max. So if you re-download from http, you may get a different hash. Maybe it's best to skip the resizing? Especially since the indexer is going to resize as well. The only downside to keeping the original size is potentially having to put very large images into ipfs.
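To make the hash mismatch concrete, here is a minimal sketch of the idea. Only the `hash_sha256` field name comes from this PR; the `with_content_hash` helper, the `uri` key, and the byte strings standing in for image data are made up for illustration.

```python
import hashlib


def with_content_hash(asset, content):
    """Return a copy of the asset dict with a hex-encoded SHA-256 of content."""
    hashed = dict(asset)
    hashed["hash_sha256"] = hashlib.sha256(content).hexdigest()
    return hashed


# Stand-ins for real image bytes: what http serves vs. what a resize step emits.
original_bytes = b"original image bytes"
resized_bytes = b"resized image bytes"

asset = {"uri": "http://example.com/image.jpg"}  # hypothetical asset shape
from_original = with_content_hash(asset, original_bytes)
from_resized = with_content_hash(asset, resized_bytes)

# Any change to the bytes changes the digest, so a hash taken after resizing
# will not match a fresh download of the original.
hashes_match = from_original["hash_sha256"] == from_resized["hash_sha256"]
```

Hashing before any resizing (or not resizing at all) keeps the recorded value comparable with whatever a later re-download returns, assuming the server still serves the same bytes.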