Kulag / KADR

Fork of AniDB UDP API client adbren. Renames and organizes anime files based on data from anidb.net.

Option to leave a symlink at the old path pointing to the new path #17

Open 2Flwrs opened 11 years ago

2Flwrs commented 11 years ago

I'd like to have an option to leave a symlink at the old path, pointing to the new path, when a file is renamed.

I have read through the parts of the code I think are most impacted by this, and could probably implement it myself. I put it here as an issue for a few reasons. Firstly, since it is your code, I wanted to inform you what I plan to do with it. Secondly, if you like the idea (and don't like others touching your code?), maybe you could help implement it. Thirdly, if I am to implement this, any pointers or caveats that you may think of are appreciated.

Use Case

The use case I am looking at is that I am using torrents to download the files. I'd like to be able to quickly get the files into my Anime file hierarchy. At the same time I want to continue to seed the torrents, which would work if KADR made a symlink at the old place (where the torrent client is looking for the file) pointing to where the file was moved to.

Of course, this only (or mostly) makes sense if the rename-pattern places the target outside of the directory that is searched by KADR.

Modus Operandi

The main change would be to (conditionally) change move($old,$new) in move_file in KADR.pm to:

move($old,$new);
symlink($new,$old);

or (to make it more atomic at the "old" side)

copy($old,$new);
symlink($new,$old . "SOME_UNIQUE_TEXT");
move($old . "SOME_UNIQUE_TEXT",$old);

(of course, there should probably be some error testing code too...)
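With the error testing filled in, the first variant might look roughly like the sketch below. `move_and_link` is a made-up name, not an existing KADR function. The sketch assumes `$new` is an absolute path and that the platform supports `symlink`.

```perl
use strict;
use warnings;
use File::Copy qw(move);
use File::Basename qw(dirname);
use File::Path qw(make_path);

# Hypothetical helper, not part of KADR: move $old to $new, then leave a
# symlink at $old pointing to $new. Dies on the first failure.
sub move_and_link {
    my ($old, $new) = @_;

    # Refuse to clobber anything already sitting at the destination.
    die "Refusing to overwrite existing '$new'\n" if -e $new || -l $new;

    make_path(dirname($new));
    move($old, $new) or die "move '$old' -> '$new' failed: $!\n";

    # $new is assumed to be absolute; a relative target would be resolved
    # relative to the directory containing $old.
    unless (symlink($new, $old)) {
        my $err = $!;
        # Best-effort rollback so the old side does not lose the file.
        move($new, $old) or warn "rollback failed: $!\n";
        die "symlink '$old' -> '$new' failed: $err\n";
    }
    return 1;
}
```

Called as `move_and_link($old, $new)`, it either leaves a real file at `$new` with a link at `$old`, or dies after trying to put the file back.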

Other Aspects

For this to work in practice there has to be some change to how the file collection routine handles symlinks. Currently, if I read the code correctly, symlinks are treated like normal files and dirs.

The possible options for symlink behaviour are given below:

Regarding symlinks to files

F1) Treat them as ordinary files (like it is now). This will break a lot of things!

F2) Ignore them. (Don't add them to @out in the children sub in App::KADR::Path::Dir)

F3) Add them to the list of files to process, but always use the pointed-to file when taking actions (such as checking the DB for pre-calculated hash). If the file requires moving, move the pointed-to file to the new path and update the link. Also update the hash DB with the new name.

Option F1 is not an option. Running KADR twice on the same dir (with symlink-creation) would wreak havoc! On the first run, the actual file would be moved to the target and the symlink created. On the second run, the symlink would be treated like a file and moved over the actual file at the target, and a new symlink created at the source. The result would be that the actual data file is deleted, leaving a symlink at the source pointing to the target and a symlink at the target pointing to itself!

Option F2 is "good enough", but to really be useful option F3 should be implemented. However, if F3 is implemented, a command line option to just ignore symlinks to files (F2) should be provided.
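To make F2 and F3 concrete, here is a rough sketch of a file collector with both behaviours. It is not the actual `children` sub from `App::KADR::Path::Dir`. The name `collect_files`, the `$mode` argument and the returned hash keys are invented for illustration.

```perl
use strict;
use warnings;
use Cwd qw(abs_path);

# Hypothetical sketch: list the files directly inside $dir.
# $mode is 'ignore' (F2) or 'follow' (F3).
sub collect_files {
    my ($dir, $mode) = @_;
    opendir(my $dh, $dir) or die "opendir '$dir' failed: $!\n";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    my @out;
    for my $name (@names) {
        my $path = "$dir/$name";
        if (-l $path) {
            next if $mode eq 'ignore';     # F2: skip every symlink
            my $target = abs_path($path);  # F3: resolve to the real file
            # Skip broken links and links that do not end at a regular file.
            next unless defined $target && -f $target;
            push @out, { path => $path, real => $target, is_link => 1 };
        }
        elsif (-f $path) {
            push @out, { path => $path, real => $path, is_link => 0 };
        }
    }
    return @out;
}
```

Under F3, the hash lookup and any move would use `real`, while `path` is the link that needs updating afterwards.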

Regarding symlinks to dirs

(This does not really have to do with the issue at hand, but it is related to symlinks, so I added it for completeness.)

D1) Treat them as ordinary dirs (like it is now). This will usually not break things. It may cause trouble if there are links within the base dir, causing the same file to have multiple names within the list of files to process.

D2) Ignore them. (Don't add them when recursing.)

Leaving it like it is now (option D1) is not bad at all; however, a command line switch to ignore them (option D2) could be implemented for completeness. Adding such a switch at the same time as the one for files would be almost free, since the logic is the same.
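A minimal sketch of the D1/D2 switch, assuming a simple recursive walker rather than KADR's real directory code. `walk` and `$skip_dir_links` are hypothetical names. The sketch does no cycle detection, so under D1 a link pointing back up the tree would recurse forever.

```perl
use strict;
use warnings;

# Hypothetical sketch: recursively list regular files under $dir.
# When $skip_dir_links is true (D2), symlinks to directories are not entered.
sub walk {
    my ($dir, $skip_dir_links) = @_;
    opendir(my $dh, $dir) or die "opendir '$dir' failed: $!\n";
    my @names = sort grep { $_ ne '.' && $_ ne '..' } readdir $dh;
    closedir $dh;

    my @files;
    for my $name (@names) {
        my $path = "$dir/$name";
        if (-d $path) {    # true for real dirs and for links to dirs
            next if $skip_dir_links && -l $path;
            push @files, walk($path, $skip_dir_links);
        }
        elsif (-f $path) {
            push @files, $path;
        }
    }
    return @files;
}
```

With a link inside the base dir that points at a sibling directory, `walk($base, 0)` lists the same file under two names, while `walk($base, 1)` lists it once.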

Kulag commented 11 years ago

You'd probably be better served setting up a script to hardlink files into your collection as they complete. You can set your torrent client to move anime torrent contents into a folder on the same partition as your collection, then execute the hardlink script.
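As a rough idea of what such a hardlink script could look like (a sketch only; `hardlink_into` and its arguments are made up, and it handles a single flat directory):

```perl
use strict;
use warnings;

# Hypothetical sketch: hardlink every regular file in $src_dir into $dst_dir.
# Both directories must be on the same filesystem for link() to succeed.
sub hardlink_into {
    my ($src_dir, $dst_dir) = @_;
    opendir(my $dh, $src_dir) or die "opendir '$src_dir' failed: $!\n";
    my @names = sort readdir $dh;
    closedir $dh;

    my @linked;
    for my $name (@names) {
        my ($src, $dst) = ("$src_dir/$name", "$dst_dir/$name");
        next unless -f $src && ! -l $src;    # regular files only
        next if -e $dst;                     # never overwrite
        link($src, $dst) or die "link '$src' -> '$dst' failed: $!\n";
        push @linked, $dst;
    }
    return @linked;
}
```

The torrent client keeps seeding from the original name, and KADR can then rename the second name inside the collection.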

My fellow developer, Kovensky, does something similar to this. You might want to ask him about it, and why he didn't comment. :P

I'm not sure I see any use case where symlinks would be more appropriate. Perhaps a folder on a separate partition where all your torrents are moved on completion without filtering?

If there is a case for it, I think that while the suggested implementation may appear simple to implement at first, it would end up being more complicated to use and to maintain.

Also, to clear up something: KADR won't overwrite existing files, it'll just complain loudly about something being in the way.

2Flwrs commented 11 years ago

My reason for symlinks is, in part, that my torrent DL and my collection are on different partitions. This is not the whole reason, and in truth I could change my torrent settings to move DL'd files to the right partition. (However, I had hell setting up my torrent client in the first place... and changing things when I have 800+ torrents registered in the client is not a fun proposal!)


Another reason for symlinks is the transparency. It is far easier to look through the torrent DLs and see which files are moved/copied to my collection if they are symlinks than hardlinks.

If they were hardlinks it would not, at first, be obvious that they are different from non-copied files (you'd have to stat them and get the link count), and there is no easy way to tell whether the other copy is in my collection or elsewhere.

With symlinks, a normal ls (with coloring) reveals which files are moved, and ls -l shows you where.
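The same difference shows up from a script. The following is only an illustration, with `link_info` as a hypothetical helper: a symlink names its target directly, while a hardlink only exposes a link count.

```perl
use strict;
use warnings;

# Hypothetical helper: describe how $path is linked.
sub link_info {
    my ($path) = @_;
    # lstat looks at the link itself instead of following it.
    my @st = lstat($path) or die "lstat '$path' failed: $!\n";

    # "_" reuses the lstat result from the call above.
    return "symlink -> " . readlink($path) if -l _;

    # Field 3 is nlink: how many names point at this inode.
    return "hardlinked ($st[3] names)" if $st[3] > 1;
    return "plain file";
}
```

For a hardlinked file there is no cheap way to find the other names; that would need a scan of the filesystem for the same inode number.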


On another note, I have gone through the code more carefully since I wrote the issue.

I can certainly see that it won't be too easy to implement. But apart from what is already pointed out, I think that the main implementation complication is that file status is cached, since:

$linkname = "/some/link-to-file";
-f $linkname;    # Will return true
-l $linkname;    # Will return true

$st = stat($linkname);
-f $st;          # Will return true
-l $st;          # Will return false

$st = lstat($linkname);
-f $st;          # Will return false
-l $st;          # Will return true

For now, I think I will try to add the symlink-feature to KADR myself.

A question regarding that: can I use the current KADR AniDB client name/version for my tests, or should I get my own AniDB client name?

I will only touch stuff in the code that has to do with local files. I will not touch anything having to do with the AniDB connection or caching of AniDB data, so it should not generate offending AniDB queries.

Kulag commented 11 years ago

The way I see it, the "kadr" client name applies to the AniDB client library KADR happens to have in the same repository, so anything that uses the client library as-is need not use a different name.