Closed NHOrus closed 4 years ago
Ah yeah, the attempted filename/path length came out to 403 characters in that, where the limit is 255 on all common modern filesystems.
Same thing happened on a booru downloader I used to follow, if someone used a TAG variable and there were like 20-30 tags for the image. They implemented a hard character limit on all filenames.
So probably something like 250 (for buffer) - <root dir length> - <file ext> = max output length of the 3 filename format variables
would solve this, as well as others that could come up without using %title%.
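As a rough sketch of that arithmetic (not anything from PixivUtil2; `name_budget` and its parameters are hypothetical names, the 250 cap is just the buffer value suggested above, and this counts characters, not bytes):

```python
def name_budget(root_dir: str, ext: str, limit: int = 250) -> int:
    """Characters left for the formatted filename part.

    limit is the overall cap including a small safety buffer (assumed 250).
    One character is reserved for the separator between root_dir and the name.
    """
    used = len(root_dir.rstrip("/\\")) + 1 + len(ext)
    return max(0, limit - used)


# 16 chars of root dir + 1 separator + 4 chars of extension are already spent.
budget = name_budget("/home/user/pixiv", ".png")
print(budget)
```

The formatted output of the filename variables would then be cut to `budget` characters before the extension is appended.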
Four filename extension characters and one for separator.
Four filename extension characters and one for separator
Sure, I meant the whole path+filename+ext as the "output" in my example. Not sure what you mean by 1 char for the separator; filenames might have different numbers of separators (e.g. spaces) depending on the number of format variables, and in the case of %tags% it would depend on the number of tags on the image. So I assume that is already done on the fly to some extent, and a hard truncate to x if filename length > 250 or on IOError 36 would be the easiest sort of thing to do. I doubt it comes up often enough to make it worth adding conditionals on which tags to include when it errors.
But of course Nandaka will know the semantic details of filename/variable interaction much better than me x3 I just was giving this a bump with basic thoughts because I was testing a related filename error ^^
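For illustration only, the "truncate on IOError 36" idea could look roughly like the sketch below. On Linux, error 36 is `errno.ENAMETOOLONG`. The function name `open_with_fallback`, the 250-byte budget, and the injectable `opener` parameter are all assumptions made for this example, not part of PixivUtil2.

```python
import errno
import os


def open_with_fallback(path: str, max_bytes: int = 250, opener=open):
    """Try to create `path`; on ENAMETOOLONG retry once with a shorter basename."""
    try:
        return opener(path, "wb")
    except OSError as exc:
        if exc.errno != errno.ENAMETOOLONG:
            raise  # unrelated error, do not hide it
    dirname, basename = os.path.split(path)
    stem, ext = os.path.splitext(basename)
    # Drop whole characters until the UTF-8 encoded name fits the byte budget.
    while stem and len((stem + ext).encode("utf-8")) > max_bytes:
        stem = stem[:-1]
    return opener(os.path.join(dirname, stem + ext), "wb")
```

This only reacts after the OS has refused the name, so no per-variable logic about which tags to drop is needed.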
Ah, I misunderstood. Either way, what needs to be cut is "the templated name", leaving room for the path and ".jpeg". On Linux, NAME_MAX is 255 and PATH_MAX is 4096. On Windows, the limit is 260 characters.
Except the unit is a single byte on Linux and a Unicode character on Windows, so the same name uses up more of the limit on Linux than on Windows.
It's a mess, honestly.
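To make the difference in units concrete, here is a small sketch that measures one short name three ways. The sample name is made up for the example, and the assumption is that Linux checks the UTF-8 encoded bytes of a name while Windows counts UTF-16 code units.

```python
name = "ボテ絵.png"

code_points = len(name)                           # what Python's len() counts
utf8_bytes = len(name.encode("utf-8"))            # unit of the Linux 255 limit
utf16_units = len(name.encode("utf-16-le")) // 2  # unit Windows counts in

print(code_points, utf8_bytes, utf16_units)
```

A `len(name) <= 255` check therefore says nothing reliable about whether Linux will accept the name.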
The filename is already cut to 255 in https://github.com/Nandaka/PixivUtil2/blob/9db6153e624e76143b188a83685b6321a23b5327/PixivHelper.py#L113
/home/nho/adata/pixiv/39182623/446430_p3_O0x7ASU437evHkT61U7YsVW5 - 4枚に+12枚=16カット。(カバーでは起きてますが行為中は起きません:ボテ絵)指で局部広げ・挿入に3カット・抽挿に4カット・射精に3カット・ペニス引き抜き溢れ精液に3カット・ボテ1カット になります。(+下書き一枚を挟んで文字なしver.).png
Should be counted as 193 chars, right? Unless on Linux the kanji/kana are counted as double-width chars.
Related call https://github.com/Nandaka/PixivUtil2/blob/master/PixivUtil2.py#L1906 https://github.com/Nandaka/PixivUtil2/blob/9db6153e624e76143b188a83685b6321a23b5327/PixivHelper.py#L71
FYI:
For Windows, 255 CHARS is usually the maximum for the FULLPATH. (There is a way to lift the limitation, but it is not commonly used: https://docs.python.org/3/using/windows.html#removing-the-max-path-limitation )
For Linux, 255 BYTES is the maximum for the FILENAME. (UTF-8 is usually used for encoding; most Japanese chars are 3 bytes, but there are exceptions.) So if a directory path is long, Linux may be able to save longer FILENAME compared to Windows.
So if a directory path is long, Linux may be able to save longer FILENAME compared to Windows.
Shouldn't it be the other way around if the Linux limitation is based on bytes? E.g. assuming the worst case (3 bytes per character), the max filename length would be 255/3 characters, wouldn't it?
In the worst case a char is represented by 4 bytes, but that is rare (a few kanji like 𠮷, and most emoji).
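A quick way to check those byte widths, and what they mean for a 255-byte limit (a throwaway sketch; the sample characters are chosen only for illustration):

```python
# UTF-8 width of a few sample characters: ASCII, Latin-1 accent, kana, rare kanji.
widths = {ch: len(ch.encode("utf-8")) for ch in ("a", "\u00e9", "あ", "𠮷")}
for ch, n in widths.items():
    print(repr(ch), n)

NAME_MAX = 255
chars_at_3_bytes = NAME_MAX // 3  # name made only of typical kana/kanji
chars_at_4_bytes = NAME_MAX // 4  # name made only of 4-byte characters
print(chars_at_3_bytes, chars_at_4_bytes)
```

So a filename consisting only of 3-byte characters fits about 85 of them, extension included, which is far fewer than 255 characters.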
I came up with this code (not tested yet).
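import os
import platform

# `name` is the full output path built earlier by the caller.
if platform.system() == 'Linux':
    # Linux: cut filename <= 255 bytes
    dirname, basename = os.path.split(name)
    filename, extname = os.path.splitext(basename)
    # Drop one trailing character per pass so a multi-byte char is never split.
    while filename and len((filename + extname).encode('utf-8')) > 255:
        filename = filename[:-1]
    name = os.path.join(dirname, filename + extname)
else:
    # cut path to 255 char (note: this also cuts off the extension)
    if len(name) > 255:
        newLen = 250
        name = name[:newLen]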
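An alternative to shortening one character per loop pass is to slice the encoded bytes once and let the decoder discard a partial trailing character. This is only a sketch: `truncate_basename` is a hypothetical helper, and it assumes the name is valid Unicode that encodes cleanly to UTF-8.

```python
import os


def truncate_basename(path: str, max_bytes: int = 255) -> str:
    """Shorten only the basename so its UTF-8 form fits in max_bytes, keeping the extension."""
    dirname, basename = os.path.split(path)
    stem, ext = os.path.splitext(basename)
    room = max(0, max_bytes - len(ext.encode("utf-8")))
    # The byte slice may end mid-character; errors="ignore" drops that fragment.
    stem = stem.encode("utf-8")[:room].decode("utf-8", errors="ignore")
    return os.path.join(dirname, stem + ext)


long_path = os.path.join("out", "あ" * 100 + ".png")
short_path = truncate_basename(long_path)
print(len(os.path.basename(short_path).encode("utf-8")))
```

Names that already fit are returned unchanged, so the helper could be applied unconditionally.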
Prerequisites
Description
On linux, trying to download work with
Files from fanbox are saved into pixivutil folder instead of artist folder
Artist put excessively long title:
I expect the file to be saved into the correct folder, possibly without the title or with a truncated filename.
Versions
Current git, reported as 20190907b