Closed cbm755 closed 1 year ago
These new docs should `@pxref{dir_encoding}`.
Regarding recommending `.oct-config` files to downstream: That is probably a good idea in general. But I'm not sure if it would solve all possible issues here. The setting in that file only applies to how .m files are parsed by the interpreter (see documentation of `dir_encoding`). Afaict, it doesn't change how files are read with functions like `fread`, `textread`, ...
I haven't looked into how `doctest` gets to the docstrings. But I'd guess that it involves some form of reading those files from the disc in text mode. Afaict, that uses the "global" `__mfile_encoding__` by default (not the folder local `dir_encoding`).
You could probably try to automate that somewhat in `doctest` by querying `dir_encoding` on the folder containing the file to be read and setting the respective encoding with `fopen`. In that case, the encoding specified in `.oct-config` files would also apply to the tested docstrings semi-automatically.
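A minimal sketch of that idea, for illustration only. The helper name `read_with_dir_encoding` is hypothetical and not part of doctest. Instead of passing the encoding to `fopen`, the sketch reads the raw bytes and converts them explicitly with `native2unicode`, which keeps the conversion step visible. It assumes an Octave version that provides `dir_encoding`, and that the folder's encoding is already known to Octave (for example because the folder is on the load path, or because `dir_encoding` was set for it).

```octave
1;  % script-file marker so the function below can be defined inline

% Hypothetical helper, not part of doctest: read a text file and convert it
% to UTF-8 using the encoding registered for the file's folder.
function txt = read_with_dir_encoding (filename)
  folder = fileparts (filename);
  if (isempty (folder))
    folder = pwd ();
  end

  % Assumption: the folder's encoding has already been registered, either
  % from its .oct-config or by an earlier call to dir_encoding.
  enc = dir_encoding (folder);

  fid = fopen (filename, "rb");
  if (fid < 0)
    error ("read_with_dir_encoding: cannot open '%s'", filename);
  end
  bytes = fread (fid, Inf, "uint8=>uint8").';   % raw bytes as a row vector
  fclose (fid);

  % Explicit conversion from the folder's encoding to UTF-8.
  txt = native2unicode (bytes, enc);
end
```

With something like this, a file in a folder whose encoding is `iso-8859-1` would come back as UTF-8 text without the caller having to name the encoding.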
I think in most cases we call builtin `get_help_text`.
I just took a quick look and the only time I can see that we do a raw `fileread` is on texinfo input (like a pure `.texinfo` file; maybe we can ignore that case for the time being: I can file a new issue). In functions, classes, oct-files, etc. we use builtin functions.
I made a quick test with an .m file encoded in ISO 8859-1 and an `.oct-config` file that contains `encoding=iso-8859-1`. `get_help_text` returned non-ASCII characters correctly converted. So, that seems to work correctly already. 🎉
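For anyone wanting to repeat that experiment, here is a rough sketch of how it could be set up. The folder, the function name `latin1_help`, and the sample text are made up for illustration; the actual files used in the test above are not shown in this thread. It assumes an Octave version that reads `.oct-config` when a folder is added to the load path.

```octave
% Temporary folder holding the test function and its .oct-config.
d = tempname ();
mkdir (d);

% Declare the folder's encoding before the folder is added to the path.
fid = fopen (fullfile (d, ".oct-config"), "wb");
fputs (fid, "encoding=iso-8859-1\n");
fclose (fid);

% Write the .m file as raw bytes so that the two non-ASCII characters
% (252 and 223, u-umlaut and sharp s) are stored as ISO 8859-1, not UTF-8.
body  = uint8 (sprintf ("e\nfunction latin1_help ()\nend\n"));
bytes = [uint8("% Gr"), uint8([252 223]), body];
fid = fopen (fullfile (d, "latin1_help.m"), "wb");
fwrite (fid, bytes);
fclose (fid);

addpath (d);
[txt, fmt] = get_help_text ("latin1_help");
rmpath (d);

% Clean up the temporary files.
unlink (fullfile (d, "latin1_help.m"));
unlink (fullfile (d, ".oct-config"));
rmdir (d);

disp (txt);
```

If the conversion works as reported, the displayed help text contains the two non-ASCII characters as valid UTF-8 rather than as stray bytes.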
Can you do a quick PR for that? Just put the two files in a subdir of `tests`, maybe "test/non_utf8_mfile". I can edit, but I think then "make test" should work.
I opened #256 that adds a test.
Added to main help in 8495ba4481c167ddb75c4d932f0da8f202a24c9c
See #251. I think we maybe need to document things a bit, probably in `help doctest`, probably pointing folks at `__mfile_encoding__` and `dir_encoding` and `.oct-config` files...
`.oct-config` files may need some doc work upstream: I have not yet found a reference to that. Perhaps we can add a little bit about encoding to https://docs.octave.org/latest/Function-Files.html