ZeroQI / Absolute-Series-Scanner

Seasons, absolute mode, Subfolders...

DATE_RX not matching youtube video date #497

Open organic-rust opened 2 weeks ago

organic-rust commented 2 weeks ago

When dates are included in YouTube video filenames in a folder with a YouTube channel ID, the file timestamp is used for the season year instead of the year in the filename. Dates in my filenames are formatted yyyy-mm-dd.

Platform

Operating system and version: Debian 12
Plex version: 1.40.2.8395
Python version: 3.11.2 (installed on system)

Expected Behavior

Season year should be read from filename

Current Behavior

Videos are being put into the wrong season year (due to incorrect file timestamps being set by yt-dlp)

Additional information

Replacing "\2" in the DATE_RX regular expressions with "[\-\.\/\_]" (the same character class as the first separator) fixed the issue for me

ZeroQI commented 2 weeks ago

\2 should repeat the same delimiter. Please indicate the filename(s) impacted.

organic-rust commented 2 weeks ago

log.txt

Attached the log, maybe that will be more useful as it shows the seasons the files are being put into as well

organic-rust commented 2 weeks ago

I played around with it a little and it looks like there are some unnecessary brackets creating duplicate capture groups, and I think it also needs round brackets around the separator so that it can be referenced? I've tested the following and it seems to be working

DATE_RX = [ cic(ur'(?P<year>19[0-9][0-9]|20[0-3][0-9])([\-\.\/\_])(?P<month>0[1-9]|1[0-2])\2(?P<day>0[1-9]|[12][0-9]|3[01])'),   #2024-05-21, 2024/05/31, 2024.05.31
            cic(ur'(?P<day>0[1-9]|[12][0-9]|3[01])([ \-\.\/\_])(?P<month>0[1-9]|1[0-2])\2(?P<year>19[0-9][0-9]|20[0-3][0-9])')]  #21-05-2024, 21/05/2024, 21.05.2024

If those are necessary, leaving them in and changing \2 to \3 also works. I'm not sure if it makes a difference
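The numbering effect described above can be reproduced in isolation. This is a minimal sketch, assuming the shipped pattern wraps each alternation in an extra pair of parentheses inside its named group (that pattern is not quoted here, so `NESTED` is a hypothetical stand-in). It uses plain `re.search` with Python 3 raw strings instead of the scanner's `cic(ur'...')`, and the filename is made up.

```python
import re

# Hypothetical stand-in for a pattern with redundant inner parentheses.
# Groups are numbered by opening parenthesis: 1 = year, 2 = inner year group,
# 3 = separator. "\2" therefore refers back to the year text, not the separator.
NESTED = (r'(?P<year>(19[0-9][0-9]|20[0-3][0-9]))([\-\.\/\_])'
          r'(?P<month>(0[1-9]|1[0-2]))\2(?P<day>(0[1-9]|[12][0-9]|3[01]))')

# Same pattern with the backreference pointed at group 3 (the separator).
NESTED_FIXED = NESTED.replace(r'\2', r'\3')

# Pattern proposed above: no inner parentheses, so the separator is group 2.
FLAT = (r'(?P<year>19[0-9][0-9]|20[0-3][0-9])([\-\.\/\_])'
        r'(?P<month>0[1-9]|1[0-2])\2(?P<day>0[1-9]|[12][0-9]|3[01])')

name = '2024-05-21 Some video.mkv'  # made-up filename for illustration
nested_match = re.search(NESTED, name)
fixed_match = re.search(NESTED_FIXED, name)
flat_match = re.search(FLAT, name)

print(nested_match)                # no match: "\2" wants the year repeated after the month
print(repr(fixed_match.group(3)))  # separator, captured as group 3
print(repr(flat_match.group(2)))   # separator, captured as group 2
```

If the extra parentheses are present, both fixes should behave the same for matching; the flat version just keeps the group numbers easier to reason about.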

ZeroQI commented 2 weeks ago

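Date used is the file creation date and the regex doesn't seem to trigger a match...

DATE_RX = [ cic(ur'(?P<year>19[0-9][0-9]|20[0-3][0-9])([\-\.\/\_])(?P<month>0[1-9]|1[0-2])\2(?P<day>0[1-9]|[12][0-9]|3[01])'),   #2024-05-21, 2024/25/31, 2024.05.31
            cic(ur'(?P<day>0[1-9]|[12][0-9]|3[01])([ \-\.\/\_])(?P<month>0[1-9]|1[0-2])\2(?P<year>19[0-9][0-9]|20[0-3][0-9])')]  #21-05-2024, 21/05/2024, 21.05.2024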

Am bad with regex, but they look good to me, tested on https://regex101.com/

I would like to avoid allowing different separators as it could trigger matches in weird ways
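To make the mixed-separator concern concrete, here is a small sketch comparing the backreference form with a variant that simply repeats the character class. The names `SAME_SEP` and `ANY_SEP` and the sample strings are made up for illustration, and it uses Python 3 `re.compile` rather than the scanner's `cic` helper.

```python
import re

YEAR = r'(?P<year>19[0-9][0-9]|20[0-3][0-9])'
MONTH = r'(?P<month>0[1-9]|1[0-2])'
DAY = r'(?P<day>0[1-9]|[12][0-9]|3[01])'
SEP = r'[\-\.\/\_]'

# Backreference form: group 2 is the separator, and "\2" must repeat it exactly.
SAME_SEP = re.compile(YEAR + '(' + SEP + ')' + MONTH + r'\2' + DAY)
# Repeated-class form: each separator is matched independently of the other.
ANY_SEP = re.compile(YEAR + SEP + MONTH + SEP + DAY)

samples = ['2024-05-21', '2024.05.21', '2024-05.21', '2024_05/21']
for text in samples:
    # Columns: input, matches with same separator required, matches with any separator
    print(text, bool(SAME_SEP.search(text)), bool(ANY_SEP.search(text)))
```

The last two samples are only accepted by the repeated-class form, which is the kind of looser match the backreference is there to prevent.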

/^(?P<year>19[0-9][0-9]|20[0-3][0-9])([\-\.\/\_])(?P<month>0[1-9]|1[0-2])\2(?P<day>0[1-9]|[12][0-9]|3[01])$/
//                                   ^                                   ^
//                                   |                                   +---- match capturing group #2
//                                   +----------------- capturing group #2

cannot see why the separator would be capturing group 3 instead of 2...
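One way to check the numbering directly, and to make the pattern independent of it, is to inspect the compiled pattern and use a named backreference, which Python's `re` supports as `(?P=name)`. This is only a sketch, not the scanner's code: the group name `sep` is an assumption, `re.compile(..., re.IGNORECASE)` stands in for `cic` (assumed to be a case-insensitive compile helper), and the extra parentheses around the year are included deliberately to show their effect on numbering.

```python
import re

DATE_NAMED = re.compile(
    r'(?P<year>(19[0-9][0-9]|20[0-3][0-9]))'  # extra inner parentheses kept on purpose
    r'(?P<sep>[\-\.\/\_])'                    # separator captured under a name
    r'(?P<month>0[1-9]|1[0-2])'
    r'(?P=sep)'                               # must repeat whatever "sep" matched
    r'(?P<day>0[1-9]|[12][0-9]|3[01])',
    re.IGNORECASE)

print(DATE_NAMED.groups)            # total number of capturing groups
print(dict(DATE_NAMED.groupindex))  # group name -> group number

m = DATE_NAMED.search('2024.05.21 Some video.mkv')  # made-up filename
print(m.group('year'), m.group('month'), m.group('day'))
```

With the inner parentheses in place, `groupindex` reports the separator as group 3, yet the named backreference still enforces the same delimiter, so adding or removing parentheses elsewhere would not silently change what the backreference points at.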