Open nbehrnd opened 2 years ago
Yes, this looks like a bug.
I'm not using compact/short/month patterns myself and therefore I was probably just sloppy when implementing it for the ISO pattern only. Which is definitely not smart. ;-)
Thanks for reporting.
@nbehrnd
I can confirm that the smart-prepend feature was only implemented for the YYYY-MM-DD(THH.MM.SS) format and no other.
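To make the limitation concrete, here is a minimal sketch. It is not appendfilename's actual code: the helper `smart_prepend`, the pattern `ISO_ONLY`, and the space used as separator for the inserted text are all assumptions for illustration. Only the ISO-style stamp is recognized, so any other stamp falls through to a plain prepend and the text lands ahead of the time stamp.

```python
import re

# Hypothetical ISO-only pattern: YYYY-MM-DD, optionally followed by THH.MM.SS,
# then one separator character and the rest of the file name.
ISO_ONLY = re.compile(
    r'^(?P<stamp>\d{4}-[01]\d-[0-3]\d(?:T[012]\d\.[0-5]\d\.[0-5]\d)?)'
    r'(?P<stamp_sep>[ _])'
    r'(?P<rest>.*)$'
)


def smart_prepend(filename: str, text: str, sep: str = ' ') -> str:
    """Insert `text` after a leading ISO stamp, else in front of everything."""
    match = ISO_ONLY.match(filename)
    if match is None:
        # Stamp not recognized (e.g. compact/month/short): plain prepend.
        return text + sep + filename
    return (match.group('stamp') + match.group('stamp_sep')
            + text + sep + match.group('rest'))


print(smart_prepend('2022-01-14_foo.txt', 'bar'))  # stamp stays in front
print(smart_prepend('20220114_foo.txt', 'bar'))    # text ends up in front
```

With a pattern that also knew the other stamp formats, the second call would keep the stamp in front as well.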
The fix might take longer than anticipated. I wrote appendfilename and date2name before I was aware of named groups in Python regular expressions. Before I fiddle with the old, complex regex, I want to introduce much more flexible regexes that are much easier to maintain in the future.
Furthermore, I'm thinking of extracting some functionality from both tools into one library to deal with date- and time-stamps within strings in general.
As a brief sneak preview, here is a snippet from my current brainstorming:
Library
[ ] Name:
input is a string
analyze_timestamp_match( re.match( mystring ) ) -> returns Object:
[ ] alternatively: no object but a list/hash table, and all of the functions below take that as parameter 1
match(bool), century, year, month, day, hours, minutes, seconds, dateformat, timeformat, separator_list
functions:
generate_timestamp(century, year, month, day, hours, minutes, seconds, dateformat, timeformat, separator_list) -> returns string in suitable format
import re

YMD_SEPARATORS = '[-_.]' # potential separator character between the entities of year, month, day
DATETIME_SEPARATORS = '[T:_ -]' # potential separator character between the entities of datestamp and timestamp; '-' comes last so that it is a literal and not the range from ' ' to '_'
HMS_SEPARATORS = '[:.-]' # potential separator character between the entities of hour, minute, second
END_SEPARATORS = '[^a-zA-Z0-9]' # potential separator character between the entities of datetimestamp and rest
TIMESTAMP_REGEX = re.compile('^' +
r'(?P<overall_datetimestamp>' + # BEGIN: overall_datetimestamp: datetimestamp with separator
r'(?P<century>\d{2})?' + # optional century: YY e.g. 20 (from 2022)
r'(?P<year>\d{2})' + # YY e.g. 22 (from 2022)
r'(?P<ym_sep>' + YMD_SEPARATORS + ')?' + # optional separator character
r'(?P<month>[01]\d)' + # MM e.g. 12 (December)
r'(?P<md_sep>' + YMD_SEPARATORS + ')?' + # optional separator character
r'(?P<day>[0123]\d)' + # DD e.g. 31
'(' + # BEGIN: timestamp is optional
r'(?P<datetime_sep>' + DATETIME_SEPARATORS + ')?' + # optional separator character
r'(?P<hour>[012]\d)' + # HH e.g. 23
r'(?P<hm_sep>' + HMS_SEPARATORS + ')?' + # optional separator character
r'(?P<minute>[012345]\d)' + # MM e.g. 59
'(' + # BEGIN: seconds are optional
r'(?P<ms_sep>' + HMS_SEPARATORS + ')?' + # optional separator character
r'(?P<second>[012345]\d)' + # SS e.g. 59
')?' + # END: seconds are optional
')?' + # END: timestamp is optional
r'(?P<end_sep>' + END_SEPARATORS + ')' + # mandatory separator character
r')(?P<rest>.*)' # END: overall_datetimestamp: datetimestamp with separator
)
# same regex but in one piece:
TIMESTAMP_REGEX = re.compile(r'^(?P<overall_datetimestamp>(?P<century>\d{2})?(?P<year>\d{2})(?P<ym_sep>[-_.])?(?P<month>[01]\d)(?P<md_sep>[-_.])?(?P<day>[0123]\d)((?P<datetime_sep>[T:_ -])?(?P<hour>[012]\d)(?P<hm_sep>[:.-])?(?P<minute>[012345]\d)((?P<ms_sep>[:.-])?(?P<second>[012345]\d))?)?(?P<end_sep>[^a-zA-Z0-9]))(?P<rest>.*)')
# examples:
# re.match(TIMESTAMP_REGEX, '2022-01-14T17.53.16_foo.bar').groups()
# -> ('2022-01-14T17.53.16_', '20', '22', '-', '01', '-', '14', 'T17.53.16', 'T', '17', '.', '53', '.16', '.', '16', '_', 'foo.bar')
# re.match(TIMESTAMP_REGEX, '2022-01-14T17.53.16_foo.bar').groupdict()
# -> {'hm_sep': '.', 'hour': '17', 'overall_datetimestamp': '2022-01-14T17.53.16_', 'century': '20', 'year': '22', 'day': '14', 'rest': 'foo.bar', 'month': '01', 'end_sep': '_', 'second': '16', 'md_sep': '-', 'ms_sep': '.', 'datetime_sep': 'T', 'ym_sep': '-', 'minute': '53'}
# re.match(TIMESTAMP_REGEX, '2022-01-14T17.53.16_foo.bar').group('hour')
# -> '17'
# re.match(TIMESTAMP_REGEX, '2022-01-14T17.53.16_foo.bar').group('second')
# -> '16'
# re.match(TIMESTAMP_REGEX, '2022-01-14T17.53.16_foo.bar').group('datetime_sep')
# -> 'T'
# The regular expression matches date- and time-stamps as long as order is YMDHM(S).
# For other orders like MM/DD/YYYY: please do re-think your life choices. ;-) *SCNR*
# Unsupported:
# - non-ISO-like orders of the entities
# - time zones or time offsets
# - weeks
# - durations or intervals
# - milliseconds
# Simplified: (YY)?YY.MM.DD.HH.MM(.SS)?
# The separation characters are limited to sets of potential characters (see regex for details).
Link for online testing the regex (test string: 2022-01-14T17.53.16_foo.bar)
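The brainstormed pair of library functions could be sketched roughly as follows, using the "hash table instead of object" variant from the notes. Everything here is an assumption for illustration: the names `analyze_timestamp` and `generate_timestamp` and their parameters are simplified from the brainstorming list, and the regex is a condensed copy of the preview in which the hyphen of the date/time separator class is placed last so that it is read as a literal.

```python
import re
from typing import Dict, Optional

# Condensed copy of the previewed regex (assumption: '-' moved to the end of
# the date/time separator class so it cannot form a character range).
TIMESTAMP_REGEX = re.compile(
    r'^(?P<century>\d{2})?(?P<year>\d{2})'
    r'(?P<ym_sep>[-_.])?(?P<month>[01]\d)'
    r'(?P<md_sep>[-_.])?(?P<day>[0123]\d)'
    r'((?P<datetime_sep>[T:_ -])?(?P<hour>[012]\d)'
    r'(?P<hm_sep>[:.-])?(?P<minute>[0-5]\d)'
    r'((?P<ms_sep>[:.-])?(?P<second>[0-5]\d))?)?'
    r'(?P<end_sep>[^a-zA-Z0-9])(?P<rest>.*)'
)


def analyze_timestamp(mystring: str) -> Optional[Dict[str, Optional[str]]]:
    """Return the named groups as a dict, or None if no stamp is found."""
    match = TIMESTAMP_REGEX.match(mystring)
    return match.groupdict() if match else None


def generate_timestamp(parts: Dict[str, Optional[str]], date_sep: str = '-',
                       datetime_sep: str = 'T', time_sep: str = '.') -> str:
    """Re-assemble the stamp from its parts with the requested separators."""
    year = (parts['century'] or '') + parts['year']
    stamp = date_sep.join([year, parts['month'], parts['day']])
    if parts['hour'] is None:
        return stamp  # date only
    time_parts = [parts['hour'], parts['minute']]
    if parts['second'] is not None:
        time_parts.append(parts['second'])
    return stamp + datetime_sep + time_sep.join(time_parts)


parts = analyze_timestamp('20220114_foo.bar')
print(generate_timestamp(parts))  # compact input, ISO-like output
```

Separating analysis from generation like this would let both tools convert between stamp styles without touching the regex itself.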
Compared to this rich manifold, the filter «is it at all one of the patterns issued by date2name» in my testing script
if (re.search(r"^\d{4}-[012]\d-[0-3]\d_", old_filename) or
    re.search(r"^\d{4}-[012]\d-[0-3]\dT[012]\d\.[0-5]\d\.[0-5]\d_", old_filename) or
    re.search(r"^\d{4}[012]\d[0-3]\d_", old_filename) or
    re.search(r"^\d{4}-[012]\d_", old_filename) or
    re.search(r"^\d{2}[012]\d[0-3]\d_", old_filename)):
    # enter the inner loop
appears naïve, because it does not consider such a large set of separators between the digits. I speculate that your variations anticipate, or are even required for, making appendfilename (and maybe already date2name) work on Linux, macOS, and Windows alike, for time stamps set by date2name as well as by other time stamp programs.
(And because my current focus is on what GLT18 showcased after presenting filetags, I currently do not use date2name's compact/month/short patterns myself.)
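For what it is worth, the five separate searches of that filter could be folded into one compiled alternation. This is only a sketch: the names `DATE2NAME_PATTERNS`, `DATE2NAME_STAMP`, and `has_date2name_stamp` are made up here, and the mapping of each pattern to a date2name option in the comments is my reading of the patterns, not something checked against date2name itself.

```python
import re

# The five patterns from the test script, without the leading '^'.
DATE2NAME_PATTERNS = [
    r'\d{4}-[012]\d-[0-3]\d_',                            # presumably the default
    r'\d{4}-[012]\d-[0-3]\dT[012]\d\.[0-5]\d\.[0-5]\d_',  # presumably --withtime
    r'\d{4}[012]\d[0-3]\d_',                              # presumably --compact
    r'\d{4}-[012]\d_',                                    # presumably --month
    r'\d{2}[012]\d[0-3]\d_',                              # presumably --short
]

# One anchored alternation, compiled once instead of five searches per file.
DATE2NAME_STAMP = re.compile('^(?:' + '|'.join(DATE2NAME_PATTERNS) + ')')


def has_date2name_stamp(old_filename: str) -> bool:
    """True if the file name starts with one of the five stamp patterns."""
    return DATE2NAME_STAMP.match(old_filename) is not None


for name in ('2022-01-14_a.txt', '20220114_a.txt', '2022-01_a.txt', 'a.txt'):
    print(name, has_date2name_stamp(name))
```

The behaviour is meant to be the same as the chained `re.search` calls; only the bookkeeping is shorter.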
The option --smart-prepend aims to keep time stamps (added by date2name) in front of the file name. However, for stamps assigned with either --compact, --month, or --short, the resulting file name differs: here, the text is placed ahead of the time stamp.

For example, in a live session of Xubuntu 20.04.2 LTS/Fossa and a pristine checkout of appendfilename/master, these are the observations:
These observations are consistent with the automatic testing by the pytest test script for Python 3, which was just extended, e.g., by