Open Abraxos opened 4 years ago
It doesn't currently exist, there was some work being done in #127 but that didn't pan out. I really want to approach these sorts of problems in a modular way, I may do more investigation into how this can be done nicely.
I would really love it if this feature also add two things:
Just to clarify, you want to be able to specify some sort of rule system that based on torrent data downloads a file to a certain directory?
In a way yes, I was a bit hasty with my suggestion: long-term it would be nice to have some kind of file (JSON, toml, custom domain) that serves as a way to generate events (since I can't think of any other way that won't grind synapse to a pulp).
In the short term however the most important part would be to have the ability to allow the creation of custom folders that synapse will seed, where the folder name would be based on either metadata, custom label, or time-stamp.
Below is an example of how that file would look like (JSON):
{
// In this case, represent the name of the event/rule, but theres other ways to seperate the events/rules
"saveLinuxIsoFiles": {
reactOn: "tracker", // this would be any form of acceptable metadata tag
seedFlag: True, // this would mean that synapse will seed even if it got a custom path,
saveTo: "/home/user/mySecretLinuxIsoCollectionOnly",
seedSpeed: "5mb",
autolabel: False
}
}
Some other torrent clients can be configured to run a script/command on completion. That would provide a superset of this functionality and be useful for other things. It could just be an arbitrary executable that would get the ID of the relevant torrent through e.g. an environment variable.
Moving to a directory on completion is then a trivial shell script that calls sycli with a move command. A little bit of extra logic could choose the destination directory based on tracker, tags, or anything else.
The basic move to a completed directory might be a common enough use case to be worth its own built-in feature anyway. But there are lots of other use cases that wouldn't make as much sense built-in but could be quite nicely implemented as scripts. Some examples off the top of my head are:
The concept could be extended to events other than completion, with the event type given in another environment variable, for example. It could be argued that at some point you'd be better off with an RPC client, but for simple things a script could save a fair amount of effort and complexity.
Personally I think running a command (linux command) is prone to abuse ugly hacks for functionality as is at least the case with rtorrent.
With scripts wouldn't that essentially mean synapse would require to maintain it's own API to said script (if we're talking about using a general-purpose script language) since otherwise you could write a script to include synapse. It would be more preferable to allow synapse to set the rules of what and what isn't to be allowed,
I'm a bit unsure what the best way to approach this is, but I feel having some sort of configuration option which can execute arbitrary shell commands in response to certain RPC events isn't a bad idea.
Off the top of my head I think something like this could be ok:
[[scripting]]
# RPC criteria
criteria = { field = "torrent.progress", op = "==", value = "1.0" }
# List of fields to pass to the script in argv
fields = [
"name",
"size"
]
command = "echo $1 $2"
This is of course prone to all sorts of dangerous abuse, but I think it's ok to have some mechanism like this which gives users a lot of flexibility to automate workloads without needing to run an additional daemon
command = "echo $1 $2"
I think this is a bad idea because it mixes trusted data (the configured command I want to execute) and potentially tainted data ($1 and $2) into a single string and passes it to a shell, which considers the whole thing trusted and tries to parse it into instructions to execute (possibly confusing untrusted data as instruction).
More reasonable would be something like:
command = ["echo", "some/${name}/path", "${size}"]
then only interpolate the variables and pass it directly to execve/std::process::Command, which forwards it to argv
of the child process without doing any shell-trickery that could be exploited. Another approach (which is probably going to be less popular) is not supporting any variables in command = ["/path/script.sh"]
at all and passing the data through environment variables instead.
With those caveats in mind I'd very much appreciate a way to configure commands to execute on events, in particular:
I agree security is definitely something to think about here. For the bit I wrote above I'm talking about $1 and $2 as referring to argv in bash, not doing any actual substitution by synapse. I wrote that up quickly though and didn't realize that this would require starting a subshell to forward the arguments, so I'd say your proposal is more along the lines of what we should do.
Awesome! Stuff like that is one of my pet-peeves and it'd be great to have a torrent client getting this right, feel free to ping me when necessary! :)
Actually for a first iteration I do think setting environment variables is probably the simplest way to handle this, we can just set something like SY_ARG_0, etc.
I think it'd be good to get feedback from people who utilized this sort of scripting in rtorrent and other clients to see what'd be most useful/convenient here.
Just curious about my main question (since this request kinda went into another direction) about the ability to seed from multiple directories, is that something that can/would be implemented?
I believe this is already implemented. Torrents can be downloaded to arbitrary directories as well as moved between them via RPC calls which update the path
field (and they remain seedable). Is this what you're looking for?
This does sound about what I'm looking for (is there any documentation/command/setting I can try this?), would this though work also with what the original issue (move on completion)?
Ah thanks a lot!
The only worry I have with the script solution is that if you got a lot of different types of files downloaded/downloading, that it'll clog up the script file.
This is why I originally suggest some kind of data format to easier distinguish between different types of files and have synapse automatically handle them, since that way the script file can be used for more unique user situations.
Although I can understand if it's going to complicate the usability of synapse (since you would essentially have 2-3 files to keep track of).
external data that finds its way into script arguments is VERY dangerous for command injection, so it will have to be done correctly and need a very thorough review! I would even say its not possible to do safely in a generic way (see gtfobins for all the surprising ways you can abuse normal binaries to execute arbitrary commands). If there are specific actions like moving files, extracting, making directories, etc. then it should be possible to sanitize input according to the function.
one of the biggest reasons I use synapse is because of rust's memory safety/security guarantees which is very much needed in a service that interacts with untrusted data. I would not trust the directory names and file names specified by the torrent file either.
I expect to be able to safely download any torrent file regardless of client configuration, though obviously the torrent client isn't responsible after the files are downloaded
I think if this feature were added, and users start sharing configurations, and adding their own commands to run, they open themselves up to the kind of problem web browsers have where users are expected not to click "untrusted links"
The command injection concern shouldn't apply here since the "command is a list instead of a string" suggestion fully bypasses any shell parsing. Sanitation (that word isn't properly defined though) shouldn't be necessary since the proposed interface is able to handle arbitrary utf-8 safely (with one caveat mentioned below). The remaining issues that we can't address with that:
Before passing arguments that are fully untrusted one should pass --
as an argument to disable further argument parsing so everything after it is passed as an opaque positional argument, otherwise the executed binary might do something unexpected if the filename starts with -
or --
.
Using regular shell scripting as an example, this might do something unexpected if the argument is user controlled:
./someprog --foo "$1"
This doesn't:
./someprog --foo -- "$1"
Somebody could create a torrent with a name that contains /../
, this is already something that needs to be handled by synapse regardless of the scripting interface, synapse should allow access to the normalized filename in case torrents with slashes in them aren't just rejected as invalid instead.
Good thought, but I still don't believe --
is a perfect solution. Again, it is program dependent, not a general solution. (this is because the string "--" will be given to the program executed to decide what to do with it, it's not a standard for exec
or spawn
or any syscall that spawns processes.
I would be much more comfortable if this functionality were limited to certain executables/actions that way mitigations can be certain and scoped correctly. If you want general scripting, that is what the RPC interface is for, and then the security risk is not assumed by synapse itself.
I would just like to remind @evanrichter & @kpcyrd that issue #198 is probably more suited for this discussion.
IMO it would be better to pass all available metadata as environmental variables, they way it's done in udhcpc (busybox). That way we could only provide the name of the script to run and don't worry about spawning subshells, argument parsing/escaping and other issues that are bound to arise.
As for moving the torrent upon completion, how would synapse know the new location after running the action? If possible, it would be better to have a plugin system that hooks up into RPC mechanism that's already in place, so that communication both ways is possible.
Adding to my comment above: I'm not sure, but it could be possible to read environment variables in synapse after the script finishes running if it exports changed variables. That way when the torrent gets moved by the scripts synapse would know its new location and could continue seeding.
we could only provide the name of the script to run and don't worry about spawning subshells, argument parsing/escaping and other issues that are bound to arise.
I agree this is a way to implement the feature, but it does not actually address the security issue.
it could be possible to read environment variables in synapse after the script finishes running
I think moving files could easily be a built in supported action and be handled much cleaner internally.
Hello,
I'm really liking this, but there is lack of a single feature that is preventing me from using it for all my torrenting. I figured out how to set separate watch directories by simply running multiple instances and combining that with fswatch and in general I really like how this is done as a very nice, self-contained, module. Please keep up the good work.
The feature I would like to recommend is the ability to move torrents to a configured directory upon completion. I apologize if this is already in there, but I couldn't find it in the configuration example or the source code.
Thank you