TASVideos / tasvideos

The code for the live TASVideos website
https://tasvideos.org/
GNU General Public License v3.0

Better descriptions for A/V file links on publication pages #1704

Closed moozooh closed 7 months ago

moozooh commented 11 months ago

Here's another unfortunate artifact carried over from the old site:

[Screenshot: the current A/V file links on a publication page, 2023-09-20]

There's no way to tell the difference between these four links based on these descriptions because they aren't describing anything. I bet half the people looking at this aren't even sure what "A/V" stands for, let alone what this enigmatic Mirror is, considering it isn't mirroring anything anymore. It used to—back when we considered BitTorrent the primary source of video captures—but that was years ago. So this is both confusing and factually incorrect.

Here's what we could (and should) do instead, in the interest of clarity:

[Mockup: proposed descriptions for the same links, 2023-09-20]

adelikat commented 11 months ago

We already have a solution: putting in a display name. Otherwise a fallback is used. What would you suggest the fallback be? We do not have any intelligent way to know what the file is.

moozooh commented 11 months ago

What's a display name? Can you give an example?

Invariel commented 11 months ago

Your username is "moozooh" but maybe you want the display name to be "hoozoom" for April 1st or something. The username is what is used internally for things, the display name is what is shown in the fora or elsewhere.

moozooh commented 11 months ago

I must've missed something, but how are usernames related to links to video downloads? Which part of the link text are they substituting?

Masterjun3 commented 11 months ago

@Invariel I think moozooh didn't ask about the concept of display names, just how they apply here.

@moozooh We currently already have a display name field for publication URLs. An example would be https://tasvideos.org/3570M where it says "CamHack via Mirror". The display name currently replaces the "A/V file" part.

vadosnaprimer commented 11 months ago

We use it for exclusive extra encodes so there's a way to distinguish them, but having to use it for every file would be a problem. I think we can get away with simply calling them "Video encodes" in the list header and using the file extension as the link text. Some encodes have a suffix after the final _ in the file name; that can go in parentheses (though we'd need a list of allowed suffixes, since sometimes user names are separated by _ too).

https://tasvideos.org/2100M [image]

https://tasvideos.org/2741M [image]
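The extension-plus-suffix idea above could be sketched roughly as follows. This is only an illustration in Python, not the site's actual code: `link_text`, `ALLOWED_SUFFIXES`, and the example URLs are hypothetical, and the allowlist contents are an assumption that would have to come from a survey of the real files.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical allowlist; anything else after "_" is treated as part of the name.
ALLOWED_SUFFIXES = {"512kb", "10bit444"}


def link_text(url: str) -> str:
    """Return link text such as 'MKV (10bit444)' or plain 'MKV'."""
    filename = posixpath.basename(urlparse(url).path)
    stem, ext = posixpath.splitext(filename)
    label = ext.lstrip(".").upper() or "File"
    # Only the text after the last underscore is a suffix candidate.
    _, sep, suffix = stem.rpartition("_")
    if sep and suffix.lower() in ALLOWED_SUFFIXES:
        return f"{label} ({suffix})"
    return label


print(link_text("https://example.org/game-tas-author_10bit444.mkv"))
print(link_text("https://example.org/game-tas-some_user.mkv"))
```

With the allowlist in place, a trailing user name such as `some_user` is not mistaken for an encode suffix and the link falls back to the bare extension.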

moozooh commented 11 months ago

@Masterjun3 I see. The screenshot I took was from this publication, where I can now see it wasn't used, which resulted in the inadequate fallback adelikat mentioned.

Considering we've had a fixed system of filenames in place since the 00s, a simple regex should be able to parse everything we need without much intelligence required of it. I only know a bit of Python, but here's something that's hopefully a working example:

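import re

def parse_url(url):
    # Detect the container type and the optional "camhack" marker in the URL.
    mkv_found = bool(re.search(r'\.mkv', url))
    mp4_found = bool(re.search(r'512kb\.mp4', url))
    camhack_found = bool(re.search(r'camhack', url))

    if mkv_found and not mp4_found:
        return 'text1' if not camhack_found else 'text3'
    elif mp4_found and not mkv_found:
        return 'text2' if not camhack_found else 'text4'
    elif not mkv_found and not mp4_found:
        raise ValueError('Neither ".mkv" nor "512kb.mp4" found in the URL.')
    # Both patterns matched: ambiguous, so fail loudly instead of returning None.
    raise ValueError('Both ".mkv" and "512kb.mp4" found in the URL.')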

That's a simple heuristic for the four different types of URLs: MKV, MP4, MKV camhack, MP4 camhack, which automatically gives each of them the names defined in text1, text2, text3, text4.

@vadosnaprimer That's much better because it makes the descriptions different from each other at least, which is good. But it would be even better if we called them by their function (as in the mockup I posted) rather than the technical details only video encode enthusiasts would properly understand. So rather than "MKV (10bit444)", it would be "High quality (MKV)". Consider that users unfamiliar with our encoding policies (realistically, most of them) have no idea whether the "MKV (10bit444)" would be bigger or smaller, or better or worse than the "MP4 (512kb)". This at least gives them some idea on which file they should go for depending on how they're planning to use it or what device to play it on. (Instead of "Streamable" we could also go with "Compatibility" for MP4.)
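To make the naming idea concrete, here is a minimal Python sketch of purpose-first labels. The `FUNCTION_LABELS` table and `functional_label` helper are hypothetical names for illustration, and the mapping assumes that MKV is always the high-quality file and MP4 the streamable one.

```python
# Hypothetical mapping from container type to a purpose-first label.
FUNCTION_LABELS = {
    "mkv": "High quality",
    "mp4": "Streamable",
}


def functional_label(extension: str) -> str:
    """Return e.g. 'High quality (MKV)'; unknown types keep the technical name."""
    ext = extension.lower().lstrip(".")
    purpose = FUNCTION_LABELS.get(ext)
    if purpose is None:
        return ext.upper()
    return f"{purpose} ({ext.upper()})"


print(functional_label("mkv"))
print(functional_label("mp4"))
```

Swapping "Streamable" for "Compatibility" would then be a one-line change in the table.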

vadosnaprimer commented 11 months ago

The problem with replacing suffixes with specific words is that the former are not consistent. At first we were marking 444 encodes with the 10bit444 suffix, then we dropped the suffix and 10bit444 became the primary downloadable. MP4 should always be streamable, but its format can also be different.

I think it would be wise to scan all currently linked encode files and check their suffixes, and based on that data decide how to present files that have them.

BTW there's only one archive.org encode now, and it's streamable, downloadable, and compatible.
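The suffix scan suggested above could look something like the following Python sketch. It assumes the list of encode URLs has already been exported somehow; `tally_suffixes` and the example URLs are hypothetical, and "suffix" here simply means whatever follows the last underscore in the file name.

```python
import posixpath
from collections import Counter
from urllib.parse import urlparse


def tally_suffixes(urls):
    """Count the text after the last underscore in each file name."""
    counts = Counter()
    for url in urls:
        filename = posixpath.basename(urlparse(url).path)
        stem, _ext = posixpath.splitext(filename)
        _, sep, suffix = stem.rpartition("_")
        # Files without any underscore are grouped under "(none)".
        counts[suffix.lower() if sep else "(none)"] += 1
    return counts


example_urls = [
    "https://example.org/a_512kb.mp4",
    "https://example.org/b_512kb.mp4",
    "https://example.org/c_10bit444.mkv",
    "https://example.org/d.mkv",
]
for suffix, count in tally_suffixes(example_urls).most_common():
    print(suffix, count)
```

Sorting by frequency makes it easy to separate suffixes that are real conventions from one-off names.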

moozooh commented 11 months ago

> I think it would be wise to scan all currently linked encode files and check their suffixes, and based on that data decide how to present files that have them.

I agree with this.

vadosnaprimer commented 8 months ago

I got a full dump thanks to this script by Dacicus, and really 512kb and 10bit444 are the only suffixes used a lot; everything else is either a name or a featured encode for something like camhack or slower speed. I don't know if showing 512kb and 10bit444 to the user would help in any way, and trying to be more helpful runs up against past inconsistency.