For example, this takes into account full-width Asian characters, combining accents on Latin characters, etc. It's not entirely trivial, because Python module filenames can, for example, contain accents.
Handling this correctly 100% of the time without a big performance hit is going to be extremely difficult. This is just a start (plus some tests) at providing a number that should hopefully be more reliable than the string length.
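
To illustrate the idea, here is a minimal sketch of such a width estimate using only the standard library's `unicodedata` module. The function name `display_width` is hypothetical and this is not necessarily how the change itself is implemented; it assumes wide and full-width characters take two columns and that combining marks and control/format characters take none.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate how many terminal columns `text` occupies (rough sketch)."""
    width = 0
    # NFC folds "base letter + combining accent" into one code point where possible.
    for ch in unicodedata.normalize("NFC", text):
        if unicodedata.combining(ch):
            continue  # leftover combining marks occupy no column of their own
        if unicodedata.category(ch) in ("Cc", "Cf", "Mn", "Me"):
            continue  # assumption: control, format and non-spacing marks are zero width
        # "W" (wide) and "F" (full-width) characters occupy two columns.
        width += 2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
    return width
```

With this sketch, `display_width("e\u0301")` gives 1 where `len()` gives 2, and `display_width("日本語")` gives 6 where `len()` gives 3. It still ignores cases such as emoji sequences and terminal-specific rendering, which is part of why getting this right all the time is hard.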
pytest without this stuff:
pytest with this stuff: