ryukinix / mal

MAL: A MyAnimeList Command Line Interface [BROKEN: BLAME MyAnimeList]
https://mal.readthedocs.io

Add torrent sub-command #64

Open ryukinix opened 7 years ago

ryukinix commented 7 years ago

Description

We should be able to get the torrent file or magnet link for the next episode by passing an anime_regex, exactly as other commands like search, filter and add currently do. Some users recommended that, by default, the torrents/magnet links come from torrents hosted on http://nyaa.si/. This may be possible through the recent PR integrating nyaa.si into torrench, a multi-platform API and CLI written in Python that integrates various torrent search engines (as suggested by @datafanatic in the previous discussion at #3).

Design choices

The default behaviour for the search is to find the desired anime in your MyAnimeList data and then get the torrent or magnet link of the next episode on your list. That way we can pass short names and not worry about multiple matches (like just typing steins and getting Steins;Gate, supposing that is the only match in your watching list). Remember that the whole search happens through a simple regex over your MyAnimeList profile content.

When we find multiple results, like other commands that act on a provided anime_regex, we should print them and ask the user (via stdin) which anime they meant by index (much like torrench does, but without using tabulate).
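The selection flow described above can be sketched roughly as follows. This is only an illustration: the function names, the case-insensitive matching and the plain list of titles are assumptions, not how mal actually stores or searches the list.

```python
import re


def match_anime(anime_regex, titles):
    """Return the titles matching anime_regex (case-insensitive by assumption)."""
    pattern = re.compile(anime_regex, re.IGNORECASE)
    return [title for title in titles if pattern.search(title)]


def choose(matches, read=input):
    """Return the only match, or ask the user for a 1-based index."""
    if not matches:
        return None
    if len(matches) == 1:
        return matches[0]
    for index, title in enumerate(matches, start=1):
        print(f"{index}: {title}")
    answer = read("Which one did you mean? ")
    return matches[int(answer) - 1]


# Hypothetical watching list standing in for the MyAnimeList profile data.
watching = ["Steins;Gate", "Serial Experiments Lain", "Steins;Gate 0"]
print(choose(match_anime("lain", watching)))
```

Passing the reader as a parameter keeps the prompt testable without touching stdin.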

Usage

Download the torrent file

Download the torrent file of the next episode of lain, which is resolved to Serial Experiments Lain through the mal data.

$ mal torrent lain

Get magnet link

Similar to the above, but print the magnet link for the next episode to stdout.

$ mal torrent --magnet lain 

A -m flag can be implemented as an alias.
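A minimal sketch of how the proposed command line could be declared, assuming argparse is used. The torrent sub-command and the --magnet/-m flag come from the proposal above; the parser layout itself is hypothetical and not taken from mal's code.

```python
import argparse


def build_parser():
    """Build a parser for the proposed `mal torrent` sub-command (sketch)."""
    parser = argparse.ArgumentParser(prog="mal")
    subparsers = parser.add_subparsers(dest="command")
    torrent = subparsers.add_parser(
        "torrent", help="get the torrent of the next episode"
    )
    torrent.add_argument("anime_regex", help="regex matched against your list")
    torrent.add_argument(
        "-m", "--magnet", action="store_true",
        help="print the magnet link instead of downloading the torrent file",
    )
    return parser


args = build_parser().parse_args(["torrent", "--magnet", "lain"])
print(args.command, args.magnet, args.anime_regex)
```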

Extra

Roadmap for torrent command

ryukinix commented 7 years ago

I'm just wondering which is best: should mal torrent <anime-regex> download torrent files by default, or print magnet links?

ghost commented 7 years ago

Getting both the magnet and the torrent file are fairly trivial, they're both displayed after the search is made. I'm curious as to how to select the right episode to download.

Also, since different fansubs post their content on nyaa.si, it would be hard to correctly use the --all flag, since there are a plethora of fansubs with different encodes, subtitles/audio languages and so on and so forth. While this could be mitigated by only searching using the "English-translated" filter, that would also mean that users who may wish to search for content in their native languages (i.e. not English) would have trouble without changing the code. I see a couple of ways of solving this:

ryukinix commented 7 years ago

Getting both the magnet and the torrent file are fairly trivial, they're both displayed after the search is made. I'm curious as to how to select the right episode to download.

Yes, the right episode is the hard part here, because the data is not consistent, and my whole argument against implementing the watch and download commands is about that. The episode number, though, we get from the user's MyAnimeList profile. I don't have a good model for that query, but as a last resort we can just search torrench for {title} {episode} and let the user pick their preferred option from the multiple matches (similar to how torrench does it). But if we find a deterministic way to fetch the right episode torrent without asking the user, that would be awesome.
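The "{title} {episode}" fallback could look roughly like this. The function name is hypothetical, and the zero-padded episode number is an assumption about how releases are commonly named, not something the thread specifies.

```python
def build_query(title, watched_episodes, pad=2):
    """Build a search string for the episode after the last one watched."""
    next_episode = watched_episodes + 1
    # Zero-padding ("04" rather than "4") is an assumption about release names.
    return f"{title} {next_episode:0{pad}d}"


print(build_query("Serial Experiments Lain", 3))
```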

Filter content using the --all flag by size instead of filtering by seeds which is the current default behavior, since packs will pretty much always be bigger than a single episode and select the one with more seeds

Yes, we can combine size & seed filtering to get the pack.

Give priority to more known fansubs and their content.

Giving priority to better-known fansubs can be a problem. We probably don't have this kind of information anywhere in a clear, accessible form that we can fetch and parse, right? Or do we? If not, we would have to hard-code it, which is not very nice. I would like to avoid that approach if possible.

Users configure their preferences: either English-only, Non-English only or RAW.

I agree about the --all issues, and user preferences can be added to the config file currently stored at ~/.config/mal/myanimelist.ini. We should definitely be able to store user preferences for language filters.

Once I have some time to try implementing it and your PR is accepted, I'll check these approaches and see which is better.
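As a rough sketch of two of the ideas discussed above (a language preference stored in the ini file, and picking a pack by size first and seeds second): the [torrent] section, the language key, the result dictionaries and the size threshold are all hypothetical and exist only for illustration.

```python
import configparser

# Hypothetical fragment of ~/.config/mal/myanimelist.ini.
SAMPLE_CONFIG = """
[torrent]
language = english
"""


def load_language(config_text):
    """Read the preferred language, falling back to "any" when unset."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return parser.get("torrent", "language", fallback="any")


def pick_pack(results, language, min_pack_bytes):
    """Keep results large enough to be a pack, then take the most seeded."""
    candidates = [
        r for r in results
        if r["size"] >= min_pack_bytes and language in ("any", r["language"])
    ]
    return max(candidates, key=lambda r: r["seeds"], default=None)


# Made-up search results; real ones would come from the torrent search tool.
results = [
    {"name": "Episode 04", "size": 350_000_000, "seeds": 900, "language": "english"},
    {"name": "Batch 01-13 (A)", "size": 4_000_000_000, "seeds": 120, "language": "english"},
    {"name": "Batch 01-13 (B)", "size": 5_000_000_000, "seeds": 300, "language": "raw"},
    {"name": "Batch 01-13 (C)", "size": 4_500_000_000, "seeds": 80, "language": "english"},
]
best = pick_pack(results, load_language(SAMPLE_CONFIG), min_pack_bytes=1_000_000_000)
print(best["name"])
```

A fixed size threshold is the weakest part of this sketch; a real implementation would probably need something relative to the typical single-episode size.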

ghost commented 7 years ago

Ok! I will make some changes in torrench in order to get the PR accepted sometime this week. Ping me when you have something.

kutsan commented 7 years ago

It would also be good to search based on a specific fansub or to provide the search regex manually.

Something like that.

$ mal torrent 'some anime' --fansub horrible

or

$ mal torrent --regex '\[HorribleSubs\]\sSome\sAnime\s-\s\d\d\s\[1080p\]'

ryukinix commented 7 years ago

The fansub flag can be useful; we can apply that filter after matching the anime name against the MyAnimeList data. But a --regex flag is pretty useless here, since every search in this package already uses regex by default (in all non-global commands, like filter, increase and decrease).
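Applying the fansub filter after the title match might look like the sketch below. It assumes release names start with the group in square brackets, as in the example above; the function name and the substring matching are illustrative choices only.

```python
import re

# Assumption: the release group is the first [bracketed] tag in the name.
GROUP_TAG = re.compile(r"^\[([^\]]+)\]")


def filter_by_fansub(release_names, fansub):
    """Keep releases whose leading [group] tag contains fansub."""
    wanted = fansub.lower()
    kept = []
    for name in release_names:
        tag = GROUP_TAG.match(name)
        if tag and wanted in tag.group(1).lower():
            kept.append(name)
    return kept


releases = [
    "[HorribleSubs] Some Anime - 01 [1080p]",
    "[OtherGroup] Some Anime - 01 [720p]",
    "Some Anime - 01 (no group tag)",
]
print(filter_by_fansub(releases, "horrible"))
```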

ryukinix commented 7 years ago

❯ mal filter steins.+
Matched 5 items:
1: Steins;Gate 0
   Plan to watch at 0/0 episodes with score -  

2: Steins;Gate
   Completed at 24/24 episodes with score 10  

3: Steins;Gate: Oukoubakko no Poriomania
   Completed at 1/1 episodes with score 9  

4: Steins;Gate Movie: Fuka Ryouiki no Déjà vu
   Completed at 1/1 episodes with score 8  

5: Steins;Gate: Kyoukaimenjou no Missing Link - Divide By Zero
   Completed at 1/1 episodes with score 9  

ghost commented 7 years ago

It seems that my PR will be merged either today or quite soon. A couple of changes have been made to the code: now, by default, the magnet link is copied to clipboard after the torrent is selected from the list and the script now interacts with Transmission, which is a torrent client for Linux.

kutsan commented 7 years ago

@datafanatic What are you using to copy to the clipboard? Honestly, I don't think copying to the clipboard by default is a good thing to do. Why not just send it to stdout and let the user copy it however they like? Or you could provide a --copy option.

ghost commented 7 years ago

@kutsan It's the default behaviour expected from modules in Torrench, so I simply implemented the same way the other modules work. But I agree, security wise it might not be the best approach. Maybe @kryptxy has anything to say about it?

Why not just send it to stdout and let the user copy it

This is also done, both the torrent URL and the magnet links are printed to stdout and loaded to the client (transmission) if the user so desires.

kutsan commented 7 years ago

@datafanatic Thanks for the answer. Looking forward to this feature.

BTW, I saw your request at MAL. Seems like we both have the same tastes.

ghost commented 7 years ago

@kutsan Okay :)

Regarding MAL, yes, indeed. Perhaps we can recommend shows to each other some time. :)

ryukinix commented 7 years ago

Great news, @datafanatic. I hope we can implement this feature in mal as fast as we can, but unfortunately I'll be busy in the next few days. Deadlines and exams at college.

ghost commented 7 years ago

No worries, @ryukinix. My PR has been merged today, FYI. Ping me when you start working on it and I'll help with what I can. Good luck with college!

ryukinix commented 7 years ago

Thanks. Nice to hear about it! When I begin to work on that I'll ping you on this thread.

kutsan commented 6 years ago

Hey @ryukinix, I hope you're doing well. I'm just curious how's this project going. It's already been ~three months. I don't want to put pressure on you or something; just out of curiosity. I would like to let you know I'm looking forward to its features for my daily use.

ryukinix commented 6 years ago

@kutsan hello! I've been busy with college all this time, sorry. My vacation started yesterday! :) I'll try to implement a few things from this issue in the next few days.

Thanks for the ping and sorry for taking so long to implement this; college is eating my time.

EDIT: (three months later)

I've been busy with other personal projects and for now I'm even busier D,: I probably won't get time to do this soon, but I'm glad that there is a nice interface in torrench (I've been using it a lot, so I think it would be OK to add this to the project).

bradenbest commented 6 years ago

Perhaps this should be moved to a separate tool. I can see a pretty nice workflow happening between mal, a theoretical nyaa-browser and rtorrent. I say this because this torrent feature is mighty ambitious, and would probably be best-served in its own program. One that "does one thing and does it well", and can focus entirely on that one task. I say you should have a chat with the devs: https://github.com/nyaadevs/nyaa

A collaborative effort to imbue nyaa with a fleshed-out API and a text-based browser to nab URLs from it interactively or non-interactively might yield great results.

After you get that working well, it's not too hard to imagine that you could add a torrent command that would simply outsource the task to nyaa-browser and spit out a magnet link to either stdout or a file, allowing the user to use their favorite torrent software from that point. You could add all this to the config, so that things like nyaa.si, nyaa-browser and the software that actually leeches the torrent can be changed. This will be especially important for when nyaa.si inevitably goes down like nyaa.se did, and gets replaced by nyaa.somethingelse

ryukinix commented 6 years ago

Yes, thinking it over during all these months, I wouldn't be too happy adding more dependencies to the project and this functionality directly. However... it would be nice to create a good interface in mal to combine some tools, as you mentioned. We need to discuss this more; compared with the last big discussion, I think we can do better.

Unfortunately, torrench has a mass of configuration that I don't like forcing users to deal with, and adding optional features seems even more complicated to me... Maybe I'm being too stupid about that, but I really like to keep things simple. I already think this tool is almost (almost) complicated (I've been thinking about simplifying things for a long time, though I probably won't do some of them), so at least not adding complex features is even more important.

If you look at the whole history, this feature was actually even more ambitious than this thread; you can see the main features in #3 and #4 (at the beginning).

I'm not dropping this feature yet, I'm just thinking about it a little. If we find a nice way to combine tools, I'd prefer that to adding more dependencies. This is quite a different approach from the #81 stuff, which relies on more software that not everyone uses.

I'd like you to provide a possible POC of the use case combining the tools you described, @bradenbest.

bradenbest commented 6 years ago

Proof of concept? I'm pretty sure GNU tar does this with its -z (gzip), -j (bzip2) and -J (xz) flags.

tar zcf compressed.tar.gz <source files>

Looking around google, I can't find any confirmation of what I said, so unfortunately, I'm going to have to dig into the source code to find out. If you've ever read GNU's source code, you know this is going to be painful. They use the "I don't give a fuck" style of programming where functions are hundreds of lines long and frequently nest in vast networks of deep complicated control structures and preprocessor macros. It's a nightmare. Eh, all I'm doing is complaining now. Let's dive in.

Looking in tar's source code src/tar.c:decode_options():

case 'z':
  set_use_compress_program_option ("gzip");
  break;

Okay, let's see what set_use_compress_program_option does:

static void
set_use_compress_program_option (const char *string)
{
    if (use_compress_program_option && strcmp (use_compress_program_option, string) != 0)
      USAGE_ERROR ((0, 0, _("Conflicting compression options")));

    use_compress_program_option = string;
}

God I hate GNU's style (FYI, I've been fixing the broken indentation this whole time, it's actually a lot uglier if you look at the rest of the code. The function I quoted above is actually pretty well-written aside from the broken indentation).

ANYWAYS, so there's a global variable called use_compress_program_option which appears to be a char * type.

.../src $ fgrep use_compress_program_option *
buffer.c:      execlp (use_compress_program_option, use_compress_program_option,
buffer.c:                   use_compress_program_option));
...
common.h:GLOBAL const char *use_compress_program_option;
...

And it is.

So, FINALLY, we get an answer. Yes, tar is literally calling the program gzip via the OS when you pass the -z flag. It's using execlp, declared in <unistd.h>, which belongs to the exec family of functions alongside execvp (the l and v mean the arguments are passed as a list or as a vector, i.e. an array, and the p means the program is looked up in PATH). These are part of the POSIX standard (POSIX.1-2001 and POSIX.1-2008), and thus any POSIX-compliant OS will have this function in its C library. What this function does is invoke a system call that replaces the current process image with the requested program; it does not fork by itself, so the caller has to fork() a child first and exec in the child. In other words, it's a layer below the system() function in the same way write() is a layer below fputs().

That said, as that is part of the POSIX C standard library, it should be available in a python wrapper.

And it is.

https://docs.python.org/3/library/os.html#process-management

I can tell you from experience that execvp and its kin are kind of a pain in the ass to use, as the OS is fickle about how you set up the environment, and you're also supposed to wait() on the process id that fork() returns for the child that calls execvp, so that when it exits, you have the return value, and your main program knows when to continue doing its thing. You might also want to try system(str), which takes str and runs it in a shell (the same as if you typed it in a terminal) and blocks until the command finishes.
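For reference, the fork, exec and wait sequence described above can be sketched with Python's os module on a POSIX system. The run_and_wait helper is hypothetical, and the child here is just the Python interpreter exiting with a known status so the example stays self-contained.

```python
import os
import sys


def run_and_wait(argv):
    """Fork, exec argv in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: execvp only returns (by raising OSError) if the exec failed.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)
    # Parent: block until the child exits, then decode its status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


exit_code = run_and_wait([sys.executable, "-c", "raise SystemExit(3)"])
print(exit_code)
```

os.fork is not available on Windows, which is one reason the higher-level subprocess module is usually the more practical choice.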

Now that we've established that this is a real thing used by real programs, I think it's safe to say you have your proof of concept. In the same way, you would execvp or system your command so that it is effectively outsourced. There are also functions, by the way, like popen() (pipe open), which give a file pointer that lets you receive the stdout output of the program being run. The python analogue of that seems to be subprocess.run

https://docs.python.org/3/library/subprocess.html
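A minimal sketch of outsourcing the lookup to an external program and reading its stdout with subprocess.run. The nyaa-browser tool does not exist, so the command below is a stand-in that runs the Python interpreter and prints a placeholder magnet link; the fetch_magnet name is also hypothetical.

```python
import subprocess
import sys


def fetch_magnet(command):
    """Run an external command and return what it printed to stdout."""
    completed = subprocess.run(
        command, capture_output=True, text=True, check=True
    )
    return completed.stdout.strip()


# Stand-in for a configurable external tool such as the imagined nyaa-browser.
fake_browser = [sys.executable, "-c", "print('magnet:?xt=urn:btih:EXAMPLE')"]
print(fetch_magnet(fake_browser))
```

Because the command is just a list of strings, it could come from the config file, which is what would let users swap the external tool.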

Alright. Time for me to hit the hay.

bradenbest commented 6 years ago

@ryukinix

That said, I figure you probably wanted a demonstration, so...

https://gist.github.com/bradenbest/a3c531307a2f76e690901522c628b6b0

The only reason this took me nine hours is because 90% of it was sleeping and 10% of it was writing code. And 90% of that 10% was spent fighting with python over weird little errors and unexpected/tricky-to-predict behavior, because I'm not all that used to python's standard library, or substrings. For example, what does "asdf"[2:1] come up with? How would I get "dfg" out of "asdfghjkl"? The answers are "" and "asdfghjkl"[2:5] or "asdfghjkl"[2:2+3], respectively.

Combine that with trying to match multi-character sequences and splice out a substring, and that's a good half hour trying to figure out why strbetween("asdfghjkl", "batman", "kl") returns "sdfghj" instead of "asdfghj" (fuck me for wanting fault tolerance, right?).
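The slicing behaviour and the fault-tolerant result wanted above can be sketched like this. It is not the code from the gist, only one hypothetical way to write a strbetween that falls back to the start or end of the string when a delimiter is missing.

```python
def strbetween(text, start, end):
    """Return the text between start and end, tolerating missing delimiters."""
    start_index = text.find(start)
    # A missing start delimiter means "begin at the start of the text".
    begin = 0 if start_index == -1 else start_index + len(start)
    end_index = text.find(end, begin)
    # A missing end delimiter means "run to the end of the text".
    if end_index == -1:
        end_index = len(text)
    return text[begin:end_index]


print(repr("asdf"[2:1]))
print("asdfghjkl"[2:5])
print(strbetween("asdfghjkl", "batman", "kl"))
```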

Sorry for ranting. I'm just annoyed that it took an hour to write instead of 15 minutes.