Previously, we were using the dataset revision in the URLs of image/audio files of cached responses of /first-rows.
However, when a dataset gets its README updated, we update the dataset_git_revision of the cache entries and the location of the image/audio files on S3, but we don't modify the revision in the URLs in the cached response.
As a result, the Viewer stopped showing the images after the README of a dataset was modified.
I fixed that for future datasets by no longer putting the revision in the URLs. Instead, the cached response stores a placeholder that is replaced by dataset_git_revision when the cached response is accessed.
Implementation details
I modified the URL Signer logic to also insert the revision in the URL and renamed it to a URL Preparator. It takes care of inserting the revision and signing the URLs.
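
To illustrate the idea, here is a minimal sketch of what "insert the revision, then sign" could look like. It is not the code from this PR: the placeholder token, the function names, the `src` key convention, and the toy HMAC signer are all assumptions made for the example, and the real signing scheme is not reproduced here.

```python
import hashlib
import hmac
from typing import Any

# Hypothetical placeholder token; the actual value used in the PR may differ.
REVISION_PLACEHOLDER = "{dataset_git_revision}"


def sign_url(url: str, secret: bytes) -> str:
    """Toy signer (assumption): appends an HMAC-SHA256 of the URL as a query parameter."""
    signature = hmac.new(secret, url.encode("utf-8"), hashlib.sha256).hexdigest()
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}signature={signature}"


def prepare_url(url: str, revision: str, secret: bytes) -> str:
    """Replace the placeholder with the current revision, then sign the result."""
    return sign_url(url.replace(REVISION_PLACEHOLDER, revision), secret)


def prepare_response(content: Any, revision: str, secret: bytes) -> Any:
    """Return a copy of a cached response with every 'src' URL prepared."""
    if isinstance(content, dict):
        return {
            key: prepare_url(value, revision, secret)
            if key == "src" and isinstance(value, str)
            else prepare_response(value, revision, secret)
            for key, value in content.items()
        }
    if isinstance(content, list):
        return [prepare_response(item, revision, secret) for item in content]
    return content
```

Because the revision is only resolved when the response is read, a README update that bumps dataset_git_revision does not require rewriting the cached response: the stored URL keeps the placeholder, and each access yields a URL that points at the current location of the file.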
close https://github.com/huggingface/dataset-viewer/issues/2965