cxam / tagx.io

Tag and share service for YouTube, Vimeo and direct videos.
https://tagx.io/

Feature request: Enable labelling of the video itself and not merely timestamps within it #2

Open JamesB0yd opened 4 years ago

JamesB0yd commented 4 years ago

A specific example:

I think it would be good to have video tags that make it possible to find all videos someone appears in.

YouTube doesn't have credits for every video, or a decent way to search them. Some people create a playlist on their YouTube channel of everything they've appeared in to replicate this. Big vloggers will just create an IMDb page to list every YouTube video they're in.

Face + voice recognition + video "authority/provenance" + proportion of time someone is on-screen in a video could allow a decent prioritised list of tags to be auto-generated.
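As a rough sketch of the "proportion of time on-screen" part only: assuming some recognition step has already produced the number of seconds each person is on-screen, a prioritised list could be built as below. The function name, the input format and the 5% cut-off are all hypothetical; nothing here does any face or voice recognition, and "authority/provenance" is not modelled.

```python
def prioritised_people(on_screen_seconds: dict[str, float],
                       duration_s: float,
                       min_share: float = 0.05) -> list[tuple[str, float]]:
    """Rank people by their share of the video's running time.

    on_screen_seconds is assumed to come from an upstream recognition
    step (not shown). People below min_share are dropped.
    """
    if duration_s <= 0:
        raise ValueError("duration_s must be positive")
    shares = {person: secs / duration_s
              for person, secs in on_screen_seconds.items()}
    kept = [(person, share) for person, share in shares.items()
            if share >= min_share]
    # Highest share first; names break ties so the order is stable.
    return sorted(kept, key=lambda item: (-item[1], item[0]))


# Hypothetical recognition output for a 10-minute video.
people = prioritised_people(
    {"Person-ABC": 300, "Person-XYZ": 60, "Passer-by": 6},
    duration_s=600,
)
names = [person for person, _ in people]
```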

Generally:

Tagging the video with:

Video Credits:

Video Characteristics:

Supplementary Metadata:

cxam commented 4 years ago

Labelling the actual video is a great idea and your example highlights a good use case. A minimal implementation could let the user tagging the video optionally provide the metadata you mentioned. The issue with that is dealing with duplicates and deciding which video would be the source of truth for the data. Currently, multiple variations of the same video can be tagged by different users. There is no internal linking between duplicate videos, so I see improvements in this area as a priority. Once the videos are linked, there are several ways the metadata could be handled: merging, selecting a single video as the source of truth, or some sort of governance.

I think starting with a good set of predefined labels (keys) would make the data easier to control and search. The service would need to be smart about how it deals with the content of the labels (values) to avoid users entering different variations. This could be solved by providing context-based label searching, but ultimately the quality of the data would be down to the user.

The AI/ML approach to auto-generating tags is something I find pretty interesting. I'm currently involved in this space through other projects and would love to do a simple POC for this service. There might be issues with breaching YouTube terms on that though, something I'll need to look into.

It's definitely a nice-to-have feature, and I want to map out how to handle the issues raised above. Let me know if you have any further thoughts.

JamesB0yd commented 4 years ago

Duplicate videos: Earliest upload date is perhaps the original content? Admittedly I'm considering independent tags and not copying tags between videos.
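The earliest-upload heuristic could be sketched as follows, assuming duplicates have already been linked into a group and each entry carries an upload date. The `Video` class and `pick_original` are hypothetical names, not part of tagx.io, and the earliest upload is only a guess at the original content.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Video:
    url: str
    uploaded: date


def pick_original(duplicates: list[Video]) -> Video:
    """Return the duplicate with the earliest upload date.

    Ties are broken by URL so the result is deterministic.
    """
    if not duplicates:
        raise ValueError("need at least one video")
    return min(duplicates, key=lambda v: (v.uploaded, v.url))


# Placeholder URLs and dates, for illustration only.
group = [
    Video("https://example.com/reupload", date(2020, 2, 26)),
    Video("https://example.com/first", date(2019, 11, 3)),
]
original = pick_original(group)
```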

Regarding tag quality, I'd expected some crowdsourced tag quality curation: e.g. Upvote-downvote of standard public tags on videos.
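A minimal sketch of that kind of curation, assuming each public tag simply counts upvotes and downvotes. The `TagVotes` class, the net-score ranking and the `min_score` threshold are illustrative choices, not anything the service currently has.

```python
from dataclasses import dataclass


@dataclass
class TagVotes:
    label: str
    up: int = 0
    down: int = 0

    @property
    def score(self) -> int:
        # Net score: upvotes minus downvotes.
        return self.up - self.down


def curated(tags: list[TagVotes], min_score: int = 0) -> list[TagVotes]:
    """Drop tags scoring below min_score, then sort best first."""
    kept = [t for t in tags if t.score >= min_score]
    return sorted(kept, key=lambda t: (-t.score, t.label))


# Made-up vote counts.
tags = [
    TagVotes("Type-Educational", up=3),
    TagVotes("Spam", up=0, down=5),
    TagVotes("Interview", up=12, down=1),
]
ranked = [t.label for t in curated(tags)]
```

A net score is the simplest option; it favours tags that have had many voters, so a real system might prefer a ratio-based ranking instead.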

JamesB0yd commented 4 years ago

i.e. I imagined the public adding tens/hundreds/thousands of their own labels to videos like: "Produced-by-Casey-Neistat" "Produced-in-Scotland" "Interview" "Starring-Person-ABC" "Published-2020-02-26" "Topic-BakingCakes" "Type-Educational" "Type-CookeryShow"

Then clips within the video might get labelled with more detailed topics: "Argument-against-PolicyX" "Demonstration-of-how-to-crack-an-egg"

Kind of how they might label their photos in order to organise them. You want to know where+when taken, who's in it, etc.
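One way to picture labels on the video itself alongside labels on clips within it is a single tag type whose time range is optional. This is a hypothetical data model for illustration (the names `Tag` and `TaggedVideo` are invented here), not how tagx.io stores tags.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Tag:
    label: str
    start_s: Optional[float] = None  # None means the tag covers the whole video
    end_s: Optional[float] = None

    @property
    def is_video_level(self) -> bool:
        return self.start_s is None


@dataclass
class TaggedVideo:
    url: str
    tags: list[Tag] = field(default_factory=list)

    def video_labels(self) -> list[str]:
        """Labels that describe the video as a whole."""
        return [t.label for t in self.tags if t.is_video_level]

    def clip_labels_at(self, second: float) -> list[str]:
        """Labels of clips whose [start_s, end_s) range contains `second`."""
        return [
            t.label for t in self.tags
            if t.start_s is not None and t.end_s is not None
            and t.start_s <= second < t.end_s
        ]


video = TaggedVideo(
    "https://example.com/watch",  # placeholder URL
    [
        Tag("Type-CookeryShow"),
        Tag("Topic-BakingCakes"),
        Tag("Demonstration-of-how-to-crack-an-egg", start_s=65.0, end_s=80.0),
    ],
)
```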

Requiring people to enter the data artificially turns this into an IMDb for the rest of the web's video, and is probably off-putting even if it's obvious that it's optional.

cxam commented 4 years ago

In your example, do you see any issues with videos tagged with variations of Produced-by-Casey-Neistat, Made-by-Casey, Created-by-Casey-Neistat? Could the perfect solution be a mix of "bring your own labels" and "predefined labels"? On a video, you might already see labels for Produced, Published, Credits, etc., but then you have the option to do your own thing and use Created. I feel that offering an optional list of these predefined labels would make discoverability a bit better. Otherwise, it could just be a dumb label where a search is basically "string contains string", with an upvote/downvote system defining the actual quality, as you've mentioned. Don't get me wrong, free-for-all labelling would be a lot easier to implement, as the service wouldn't have to worry about governance.

JamesB0yd commented 4 years ago

Structured / predefined labels are definitely something that could be useful as a separate category of tag. What I had in mind is making the labels public, so the community can see over time what people do with tags and let them evolve. Turning the tags into structured labels at that point would allow variations to be eliminated by merging tags, improving their quality. Processing the text to learn which different tags ultimately have the same meaning could be another approach.
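A very simple form of that merging could look like the sketch below. It only normalises case and separators and applies a small hand-written alias table; the `ALIASES` mapping and both function names are hypothetical, and this is a stand-in for, not an example of, learning which tags mean the same thing.

```python
import re
from collections import defaultdict

# Hypothetical, hand-curated aliases for the leading word of a label.
ALIASES = {"made": "produced", "created": "produced"}


def canonical(label: str) -> str:
    """Lower-case the label, unify separators and apply ALIASES."""
    tokens = [t for t in re.split(r"[\s_\-]+", label.strip().casefold()) if t]
    if tokens and tokens[0] in ALIASES:
        tokens[0] = ALIASES[tokens[0]]
    return "-".join(tokens)


def merge_variants(labels: list[str]) -> dict[str, list[str]]:
    """Group the original labels under their canonical form."""
    groups: dict[str, list[str]] = defaultdict(list)
    for label in labels:
        groups[canonical(label)].append(label)
    return dict(groups)


merged = merge_variants([
    "Produced-by-Casey-Neistat",
    "produced by casey neistat",
    "Created-by-Casey-Neistat",
    "Made-by-Casey",
])
```

Note that `Made-by-Casey` stays in its own group because the name is abbreviated; catching that kind of variation would need the smarter text processing or community merging described above.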

As an aside, are you able to read the user's system language so that each tag can have an associated language, to ensure tags in different languages aren't merged when they have different meanings? Perhaps one day, if this takes off.
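A sketch of what language-aware merging might mean in practice, assuming each tag arrives with a language code from somewhere (for example the browser's `Accept-Language` header; how the code is obtained is not shown and is an assumption). Tags are only merged when both the language and the case-folded label match.

```python
from collections import defaultdict


def merge_by_language(tags: list[tuple[str, str]]) -> dict[tuple[str, str], list[str]]:
    """Group (language_code, label) pairs by language plus case-folded label."""
    groups: dict[tuple[str, str], list[str]] = defaultdict(list)
    for lang, label in tags:
        groups[(lang.lower(), label.casefold())].append(label)
    return dict(groups)


# "Gift" is a present in English but means "poison" in German,
# so the two should not end up as one tag.
by_language = merge_by_language([
    ("en", "Gift"),
    ("de", "Gift"),
    ("en", "gift"),
])
```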

(In reply to cxam's comment of Wed, 26 Feb 2020 at 23:50.)


cxam commented 4 years ago

Yeah, I agree that doing some post-processing on the tags would be good, without requiring the user to fill in certain labels.

Regarding the user language, I'm not tracking anything specific to users other than what anonymised Google Analytics collects. I guess this could be an opt-in feature to bring in once there is a proper user account system in place.