Failed to download files with long title

ghost commented 7 months ago

Version

Version:2024.02.02

Your Command


py kemono-dl.py --cookies cookies.txt --links https://kemono.su/fanbox/user/81480/post/6992708 --dirname-pattern "X:\Kemono" --filename-pattern "{username}\{published}_{id}_{username}_{title}_{index}.{ext}" --verbose

Description of bug

Fails to download files with long filenames. The files are saved to a Linux NAS. Besides the link above, downloads of other posts with long titles also fail sometimes.

How To Reproduce

Download files with long filenames to a Linux system.

Error messages and tracebacks

debug.log

Additional comments

L4cache commented 7 months ago

Fixed. Please note that with your naming pattern, the index will be stripped off.

ghost commented 7 months ago

Another error message appeared. debug.log

I don't want to change the current filename pattern, so I added a replacement that shortens only the title portion of the filename. I am not familiar with Python or regular expressions, so there is probably a better implementation.

helper.py

import os
import re

# clean file name for windows & linux
def clean_file_name(file_name: str):
    if not file_name:
        file_name = '_'
    # replace control characters and characters that are invalid on Windows
    file_name = re.sub(r'[\x00-\x1f\\/:\"*?<>\|]', '_', file_name)

    # shorten the title to 60 characters
    # this only works for {published}_{id}_{username}_{title}_{index}, so if you change filename_pattern, change this too
    # (raw string, so \d is not treated as an invalid string escape)
    file_name = re.sub(r'(\d+_)(\d+_)(.+_)(.{0,60})(.*)(_\d+)', r'\1\2\3\4\6', file_name)

    file_name, file_extension = os.path.splitext(file_name)
    name_limit = 255 - len(file_extension) - 5
    name_clean = file_name[:name_limit] + file_extension
    # the loop checks the UTF-8 byte length, not the character count, so keep trimming until it fits in 255 bytes
    while len(name_clean.encode('utf-8', 'replace')) > 255:
        name_limit -= 1
        name_clean = file_name[:name_limit] + file_extension
    return name_clean
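
For reference, the byte-length loop at the end of the function could probably also be written without trimming one character at a time. This is only a rough sketch: truncate_utf8 and shorten_file_name are hypothetical names, not part of kemono-dl, and the 255-byte limit is assumed from what typical Linux filesystems such as ext4 allow for a single file name.

```python
import os


def truncate_utf8(text: str, max_bytes: int) -> str:
    """Cut text so that its UTF-8 encoding fits in max_bytes (hypothetical helper)."""
    encoded = text.encode('utf-8')
    if len(encoded) <= max_bytes:
        return text
    # errors='ignore' drops a partial multi-byte character left at the end of the slice
    return encoded[:max_bytes].decode('utf-8', errors='ignore')


def shorten_file_name(file_name: str, max_bytes: int = 255) -> str:
    """Keep the extension and trim only the stem (hypothetical helper)."""
    stem, ext = os.path.splitext(file_name)
    # reserve room for the extension, then give the rest of the budget to the stem
    budget = max_bytes - len(ext.encode('utf-8'))
    return truncate_utf8(stem, budget) + ext
```

Japanese characters usually take 3 bytes each in UTF-8, so a title that looks short in characters can still exceed the byte limit, which may be why these posts fail.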

L4cache commented 7 months ago

You can probably modify the compile_file_name function instead: use deepcopy (add "from copy import deepcopy" first) and change the post_variables dict with update. Example:

new_post_variables = deepcopy(post_variables)
new_title = {'title': post_variables['title'][:60]}
new_post_variables.update(new_title)

(Then change the references to post_variables to new_post_variables in the rest of the function.)
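
Put together as a standalone sketch, the idea might look roughly like this. compile_file_name_sketch is a hypothetical stand-in for the real compile_file_name, and substituting the pattern with str.format is an assumption made only to keep the example runnable; the real function may fill in the pattern differently.

```python
from copy import deepcopy


def compile_file_name_sketch(pattern: str, post_variables: dict, title_limit: int = 60) -> str:
    """Hypothetical stand-in for compile_file_name that shortens only the title."""
    # work on a copy so the caller's post_variables keeps the full title
    new_post_variables = deepcopy(post_variables)
    new_post_variables.update({'title': str(post_variables['title'])[:title_limit]})
    # assumption: the pattern uses {name} placeholders that str.format can fill
    return pattern.format(**new_post_variables)


if __name__ == '__main__':
    post = {'published': '20240101', 'id': '6992708', 'username': 'user',
            'title': 'x' * 100, 'index': 1, 'ext': 'jpg'}
    print(compile_file_name_sketch('{published}_{id}_{username}_{title}_{index}.{ext}', post))
```

Because the title is shortened before the pattern is filled in, this does not depend on the order of the fields in the filename pattern, and the index at the end is kept.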

L4cache commented 7 months ago

I can't reproduce the "another error". Is it perhaps specific to the JP locale? I'll try installing a Japanese version of Windows in a VM to test...

L4cache commented 7 months ago

Still can't reproduce the "another error".

L4cache / kemono-dl