hasgeek / funnel


Image management #767

Open jace opened 4 years ago

jace commented 4 years ago

Several parts of the website need images. We've had Imgee for a while, but it's been hard to maintain as an independent project. Our options are to (a) merge Imgee into Funnel, similar to what was done with Lastuser, or (b) create a new architecture and port data over.

Imgee has three models (StoredFile, Thumbnail and Label, each referred to below), all owned by the Profile model acquired from Lastuser.

This simple scheme has done reasonably well, but also has problems:

  1. The StoredFile model stores and serves unprocessed files (via an AWS S3 bucket). This is a risk factor for privacy (EXIF tags are not stripped), malware (content is not checked) and misuse of service (no limits on storage). Unprocessed files should ideally not be stored at all, or should not be served raw to the public.

  2. Originals are stored once per profile instead of once per file, leading to redundant storage.

  3. Labels are user content, but the requirement for labels comes from apps, such as for a gallery of profile pictures. This causes a problem similar to playlists in HGTV, where an invisible type column identifies the playlist instead of the user-facing title. Label doesn't have such a column, but needs one. In fact, this app-level use case is more important than user-level labeling.

  4. There is a use case for one-off grouping. This is visible in Telegram's photo sharing feature, where if a group of images is shared in a chat, they appear as an album, with an optional caption. These albums only exist as a message in a chat stream. They can't be accessed independently, and if a new set of images is shared, a new album is created to host them. Albums are ephemeral, disappearing into the scrollback. Labels, in contrast, are forever and always visible upfront in the Imgee gallery.

  5. Thumbnails keep the aspect ratio of the original image. An app's requirement, however, may be for a specific aspect ratio to which the uploaded image must be cropped. Imgee doesn't support this. Cropping creates a derivative image, of which Imgee has no understanding. CSS can be used to automatically position an image within a visible area, but this will not let the user choose the portion to crop, nor will it provide guidance to the user at the time of uploading the image.

  6. Imgee uses a CDN to host scaled images, but the image scaler is an app endpoint (at /embed/file/<file-uuid>?size=<x>x<y>). It returns a redirect to the (pre-)generated thumbnail. It is therefore used directly in <img> tag URLs in many places. Calling it has a non-zero cost, and creates a dependency on long-term stability of the endpoint. Image availability is directly linked to Imgee app uptime.

  7. Animated GIFs should ideally be converted into videos for more efficient use of storage and bandwidth (the so-called GIFV format). However, this requires changing the HTML tag from <img> to <video>, which is not possible in a URL-based API. Several other optimizations are similarly unavailable in a URL-based API.

For these reasons, we should re-enumerate requirements and remap existing data in Imgee to a new architecture, instead of simply merging repositories.
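To illustrate problem 7, here is a minimal sketch of an API that returns markup instead of a URL, so that a converted animation can be emitted as a `<video>` tag. The `RenderableMedia` class and `render_media` function are hypothetical names for illustration and are not part of Imgee or Funnel.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class RenderableMedia:
    """Hypothetical description of a stored media item (not an Imgee model)."""

    url: str
    alt: str
    animated: bool = False  # True when a GIF has been converted to a video file


def render_media(media: RenderableMedia) -> str:
    """Return the HTML tag appropriate for the media item."""
    url = escape(media.url, quote=True)
    alt = escape(media.alt, quote=True)
    if media.animated:
        # A converted GIF plays silently in a loop, mimicking the original
        return (
            f'<video src="{url}" aria-label="{alt}" '
            'autoplay loop muted playsinline></video>'
        )
    return f'<img src="{url}" alt="{alt}">'
```

A caller that only receives a URL has already committed to `<img>`. A caller that receives markup leaves the choice of tag with the media service.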

New requirements

  1. An image spec: an app feature can require an image matching given specifications: aspect ratio, bucket of pixel sizes (for responsive use), and maybe constraints on image type (no vectors, no animation, no transparency, etc). Examples include profile icons (cropped circle), project covers (16:9, multiple images allowed) and profile covers (single image, but aspect ratio changes by device, so the user needs to be informed).

  2. The image itself, but stripped of privacy-sensitive information like EXIF tags. It's unclear if storing originals is a good idea.

  3. Anonymous albums (meaning not labeled) used wherever images are used, rendering as a carousel (project covers) or a tiled gallery (chat message) or not being allowed at all (profile icons).

  4. Media gallery, either directly on a profile (for long term storage and recurring use) or in specific contexts like projects. Galleries are clearly useful for users who own the image, but their utility vs risk as public-facing galleries is less clear. We will want to publish reference art (logos, image templates, etc) for third party use, but if public galleries are an underutilized and under-moderated feature, they'll become a dumping ground. Public galleries will need a product spec, not just a tech spec.

  5. A "transformed image", representing the application of an image spec on an uploaded image. Transformations can be simple, like resizing and file format change (Imgee's existing Thumbnail model), or could represent a user-selected portion to crop, creating a derived image. We could simplify this by transforming the image during the upload itself, and storing both as separate images, or discarding the original. This removes the need for tracking derived images, but it makes it harder to change image spec. A change in aspect ratio, for instance, will require re-upload/re-select of images, instead of simply re-transforming the reference image.

  6. Possibly for bandwidth/processing savings: image storage should be deduplicated (using a hash) and de-coupled from image ownership, similar to how the EmailAddress model from #714 treats email addresses as platform data, and auto-discards when refcount drops to zero. If we adopt this, we'll need separate Image and ImageUsage models (or something better named). We will also have to store reference data to support de-duplication. For instance, a hash of the original file even if the original is not stored.
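Requirements 1 and 5 could be sketched roughly as follows, assuming an image spec is a plain value object and a transformation starts from a default centred crop that the user may later adjust. `ImageSpec`, its field names and the example sizes are assumptions for illustration, not an agreed design.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ImageSpec:
    """Hypothetical image spec: aspect ratio, size bucket and type constraints."""

    aspect_w: int
    aspect_h: int
    widths: Tuple[int, ...]  # bucket of output widths for responsive use
    allow_vector: bool = False
    allow_animation: bool = False
    allow_transparency: bool = False
    multiple: bool = False  # whether an album is allowed

    def default_crop(self, width: int, height: int) -> Tuple[int, int, int, int]:
        """Largest centred box (left, top, width, height) matching the ratio."""
        # Compare width/height against aspect_w/aspect_h by cross-multiplying
        if width * self.aspect_h > height * self.aspect_w:
            # Source is too wide: keep full height, trim the sides
            crop_h = height
            crop_w = height * self.aspect_w // self.aspect_h
        else:
            # Source is too tall (or exact): keep full width, trim top and bottom
            crop_w = width
            crop_h = width * self.aspect_h // self.aspect_w
        return ((width - crop_w) // 2, (height - crop_h) // 2, crop_w, crop_h)

    def render_sizes(self, crop_w: int) -> List[Tuple[int, int]]:
        """Output (width, height) pairs, never upscaling beyond the crop."""
        return [
            (w, w * self.aspect_h // self.aspect_w)
            for w in self.widths
            if w <= crop_w
        ]


# Example values only; the real size bucket is undecided
PROJECT_COVER = ImageSpec(16, 9, widths=(480, 960, 1920), multiple=True)
```

If the reference image is retained, changing a spec's aspect ratio only requires calling `default_crop` again. If only the transformed image is kept, the same change needs a fresh upload, which is the trade-off described in requirement 5.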

Project covers present additional complications, best understood by seeing how others do it:

  1. On Kickstarter, there is one cover video (the pitch) and one cover image. The video is shown only on the project page, and only during the fundraising process. The cover image is used all other times. Additional images and videos must appear in the body, and there is no user-browsable gallery. However, the technical concept does exist, as updates are shown summarized with thumbnails of the images within.

  2. On Indiegogo, a project can have a cover gallery containing both images and videos. Galleries are ordered. Creators use them to create pitch videos in multiple languages (especially Continental European creators, who tend to use English + local language). Others use them to present a marketing video followed by a technical video. It is unclear if the cover image is the first image in the gallery, or is specifically picked by the creator.

Kickstarter hosts its own video (with a poor quality CDN for users in India). Indiegogo does not, and supports both YouTube and Vimeo-hosted videos. The UX is clearly hindered by this embed (YouTube and Vimeo's overlays are distracting), but playback performance is usually better.

Videos are not part of this spec, being handled separately by the VideoMixin (#572) class, which allows one video per sub-class. It will allow one cover video for the project, and one cover image or cover album can be added separately from this spec.

However, projects already have use cases for multiple videos, coming from different angles:

  1. From physical events, we've needed multiple livestreams for multi-track events. This is currently handled with a hacky JSON field for their URLs.

  2. Some web-only projects add a cover video using the livestream mechanism to make a pitch for the project.

  3. Web-only projects add the YouTube livestream for when it's live. Zoom is not supported, although it's the source channel and the only one allowing interaction. Copy broadcasts on Twitter Periscope, Facebook Live and Instagram are also not supported. YouTube is used as hosting infrastructure here.

  4. Single session projects then replace the livestream with the processed video, once again abusing the livestream feature.

These use cases all suggest that (a) the project cover is prime real estate, and (b) any spec for images will have to account for interplay with videos, which currently have only a single-video spec in #572's VideoMixin.

We may have to spec media management rather than image management.

jace commented 4 years ago

Telegram has image and video sharing distinct from file sharing. Images and videos are always processed, but files are sent as originals. However, if the file contains an image, it is thumbnailed and displayed as an image.

jace commented 4 years ago

Multi-part images add an additional complication because they straddle the concepts of image, album and video.

If we spec image, album and video as distinct concepts, we'll have to force transform source files to fit one of these concepts.

The GIF format and Telegram's stickers add a new concept: an animation. Animations are treated as images that start animating when in focus (i.e., visible in the viewport while the window has focus). They are distinct from videos, which cause the operating system's media controls to activate.

jace commented 3 years ago

libvips appears to be a suitable replacement for Pillow and ImageMagick as an image processor. There are Python bindings (pyvips), and published performance benchmarks: https://github.com/libvips/libvips/wiki/Speed-and-memory-use

jace commented 3 years ago

Tentative database models:

  1. File: Represents a raw uploaded file. This may or may not be processed immediately after upload (strip EXIF tags, minify SVG, etc), so it is stored with two content hashes: of the original file as uploaded, and of the current contents. The original hash is used for deduplication.

    File models are not linked to a user. Like with the EmailAddress model, they are platform data, and use foreign key ref-counting to track usage. Unused files should be deleted after a timeout period (say 24 hours). The File record may remain even when a file is removed, to track historical data (same as with EmailAddress).

  2. Gallery: Like with Commentset, represents a universal mechanism to add media files anywhere they are needed. ~Galleries may be limited to a single or multiple files, specified as a flag.~

  3. Media: Join model between Gallery and File. Represents a specific rendering of the file (usually a custom crop, e.g. for avatars), and maintains an internal representation of pre-rendered thumbnails.

    Models that support media (Project, Proposal/Submission) should have a gallery_id fkey. Ownership flows from this parent model. Gallery, Media and File do not track direct owners, but do track actors, the users who uploaded or created them. Optionally, when only a single file is desired, models can link directly to the Media model. However, Media will then need a nullable gallery_id, which will then make it possible to have orphaned Media rows (as with orphaned Gallery and File rows), making refcounting all the more critical.

~Since the Gallery and Media models are always associated, it may be efficient to have them as a single model with an array structure (Postgres hstore or jsonb). However, this may complicate the fkey refcounting File needs.~
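The File model's deduplication and cleanup rules could look something like the in-memory sketch below. It assumes SHA-256 for both hashes and an explicit counter standing in for foreign key ref-counting; `FileStore`, `FileRecord` and their methods are hypothetical names, and a real implementation would live in database models.

```python
import hashlib
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, Optional

RETENTION = timedelta(hours=24)  # the "say 24 hours" timeout from the spec


@dataclass
class FileRecord:
    original_hash: str  # hash of the file as uploaded, used for deduplication
    content_hash: str  # hash of the current (possibly processed) contents
    content: Optional[bytes]  # None once purged; the record itself remains
    refcount: int = 0  # stand-in for counting foreign key references
    unused_since: Optional[datetime] = None


class FileStore:
    def __init__(self) -> None:
        self.by_original_hash: Dict[str, FileRecord] = {}

    def upload(self, data: bytes, now: datetime) -> FileRecord:
        digest = hashlib.sha256(data).hexdigest()
        record = self.by_original_hash.get(digest)
        if record is None:
            record = FileRecord(digest, digest, data, unused_since=now)
            self.by_original_hash[digest] = record
        elif record.content is None:
            # Re-upload of a purged file: restore contents, restart the clock
            record.content, record.content_hash = data, digest
            record.unused_since = now if record.refcount == 0 else None
        return record

    def process(self, record: FileRecord, processed: bytes) -> None:
        """Replace contents after processing (strip EXIF, minify SVG, etc)."""
        record.content = processed
        record.content_hash = hashlib.sha256(processed).hexdigest()

    def link(self, record: FileRecord) -> None:
        record.refcount += 1
        record.unused_since = None

    def unlink(self, record: FileRecord, now: datetime) -> None:
        record.refcount -= 1
        if record.refcount == 0:
            record.unused_since = now

    def purge(self, now: datetime) -> int:
        """Drop contents of files unused for RETENTION; keep their records."""
        purged = 0
        for record in self.by_original_hash.values():
            if (
                record.refcount == 0
                and record.content is not None
                and record.unused_since is not None
                and now - record.unused_since >= RETENTION
            ):
                record.content = None
                purged += 1
        return purged
```

Because lookup uses the original hash, a second upload of the same raw file matches even after processing has changed the stored contents.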

jace commented 3 years ago

This spec is problematic:

Media: Represents a specific rendering of the file (usually a custom crop)

It requires a whole file to be uploaded (regardless of size) and a crop to be applied server-side. However, cropping can also happen client-side, as modern JavaScript is capable of it, and removing unwanted content before the media reaches the server is better for user privacy. Client-side cropping also removes the need for a File model, as its primary purpose is deduplication.

In addition, we need at least three other models:

  1. Thumbnail: A render of media, representing the file that will actually be served.

  2. Spec: A specification for what media is required, applied during the upload instead of the render. Typically an aspect ratio and a max size.

  3. Viewport: Rules for rendering the image, implemented on the client side via CSS. Typically used for circular avatars or rounded corners, where the browser applies the viewport. The viewport spec is required separately from CSS rules as viewports may be shown as an overlay during image upload.
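As a rough sketch of how Spec and Viewport might divide the work: the server checks that an already-cropped upload meets the Spec, while the Viewport is only data that the client turns into CSS (and into an overlay during upload). The field names, the one pixel rounding tolerance and the shape names are all assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Spec:
    """Hypothetical upload requirement: an aspect ratio and a max size."""

    aspect_w: int
    aspect_h: int
    max_width: int
    tolerance_px: int = 1  # assumed slack for rounding in client-side crops

    def problems(self, width: int, height: int) -> List[str]:
        """Return reasons to reject an upload; an empty list means accepted."""
        found = []
        if width > self.max_width:
            found.append(f"width {width} exceeds maximum {self.max_width}")
        expected_height = round(width * self.aspect_h / self.aspect_w)
        if abs(height - expected_height) > self.tolerance_px:
            found.append(
                f"expected {self.aspect_w}:{self.aspect_h}, got {width}x{height}"
            )
        return found


@dataclass(frozen=True)
class Viewport:
    """Hypothetical rendering rule, applied by the browser rather than the server."""

    shape: str  # "rect", "rounded" or "circle" (assumed vocabulary)
    radius_px: int = 0

    def css(self) -> str:
        if self.shape == "circle":
            return "border-radius: 50%; object-fit: cover;"
        if self.shape == "rounded":
            return f"border-radius: {self.radius_px}px; object-fit: cover;"
        return "object-fit: cover;"


# Example values only
AVATAR_SPEC = Spec(1, 1, max_width=512)
AVATAR_VIEWPORT = Viewport("circle")
```

Keeping the Viewport as data rather than hardcoded CSS is what would let the upload UI draw the same circle as an overlay while the user positions the crop.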

jace commented 3 years ago

Optionally consider using NudeNet to classify images for moderators.