gnu-octave / octave-doctest

Doctests for Octave/Matlab
https://gnu-octave.github.io/packages/doctest/
BSD 3-Clause "New" or "Revised" License
16 stars 4 forks source link

Add test with a file that is not encoded in UTF-8 #256

Closed mmuetzel closed 2 years ago

mmuetzel commented 2 years ago

See also #253 and #254.

make test is passing for me locally. But actually, the file isn't parsed correctly in that test and warnings appear in the log. The latter is likely due to an issue in upstream Octave. When I cd to the directory containing that file, it isn't decoded correctly:

>> cd ('test_encoding')
>> enc = dir_encoding ('.')
enc = windows-1252
>> a = get_help_text ('test_CP1252')

warning: Invalid UTF-8 byte sequences have been replaced.
a =
 @deftypefn {} {} test_CP1252 ()
 Test function with some characters from CP1252

 ������ ����

>> clear all  % to make sure functions are re-parsed

>> a = get_help_text ('test_CP1252')

warning: Invalid UTF-8 byte sequences have been replaced.
a =
 @deftypefn {} {} test_CP1252 ()
 Test function with some characters from CP1252

 ������ ����

However, if I change to a different directory and add the path to that function to the load path, it works:

>> cd ..

>> addpath (canonicalize_file_name ('test_encoding'))

>> clear all  % to make sure functions are re-parsed

>> a = get_help_text ('test_CP1252')

a =
 @deftypefn {} {} test_CP1252 ()
 Test function with some characters from CP1252

 ÄÖÜäöü ŠŽšž

It looks like functions in the current folder aren't parsed with the encoding specified in the .oct-config file...

mmuetzel commented 2 years ago

I opened a bug report upstream here: https://savannah.gnu.org/bugs/index.php?62761

cbm755 commented 2 years ago

I tried putting something like this so that the function has a doctest:

diff --git a/test/test_encoding/test_CP1252.m b/test/test_encoding/test_CP1252.m
index 9ed8d39..a633d95 100644
--- a/test/test_encoding/test_CP1252.m
+++ b/test/test_encoding/test_CP1252.m
@@ -34,6 +34,10 @@
 %%
 %% <C4><D6><DC><E4><F6><FC> <8A><8E><9A><9E>
 %%
+%% @example
+%% s = 'thx Markus M<FC>tzel'
+%%   @result{} s = thx Markus M<FC>tzel
+%% @end example
 %% @end deftypefn

(That isn't likely to display properly but hopefully you get the idea).

cbm755 commented 2 years ago

Now I'm uncertain whether to merge this: its a bit noisy! And is going to be on Octave <= 7.1.0 for a long time...

Perhaps I can try to implement this as a BIST, where we can test it only when Octave > 7.1.0 or something like that. That might also give us the flexibility to test with and without your "add_path" workaround... Have to think about how to do this without another top-level directory...

cbm755 commented 2 years ago

Incidentally, I've not yet found found a tool that can reliably open that file!

Is this really a "fair" input?

cbm755 commented 2 years ago

If I check "Change encoding" and then select "Windows 1252" it does display correctly in Octave's built-in Qt editor... Shouldn't that UI also be respecting the .oct-config file?

mmuetzel commented 2 years ago

Shouldn't that UI also be respecting the .oct-config file?

That sounds like a new feature that could improve the user experience. Could you please open a bug report for that on Savannah?

mmuetzel commented 2 years ago

I added an example to that file like you suggested.

mmuetzel commented 2 years ago

Also moved the new test to a separate folder that allows executing dedicated commands for it. We could probably also limit running the test to specific Octave versions...

Edit: I'm not sure the structure of that test is correct though. Does the try-catch-block make sense to you? Or should the test be run differently?

Edit 2: I tried with a deliberately wrong test: The try-catch-block didn't make sense. Adapted the code accordingly.

mmuetzel commented 2 years ago

Shouldn't that UI also be respecting the .oct-config file?

That sounds like a new feature that could improve the user experience. Could you please open a bug report for that on Savannah?

I opened a report for upstream here: https://savannah.gnu.org/bugs/index.php?62767

cbm755 commented 2 years ago

Or should the test be run differently?

I have not thought carefully about this. I have a dim plan to merge this with the %! BIST tests, currently executed by make test-bist target, but haven't tried it yet.

cbm755 commented 2 years ago

BTW, I finally got my local hg clone building again, and your patch works great on hg tip---thanks!

cbm755 commented 2 years ago

I made your "test_extra" into a BIST. I may consider moving test/bist.m to this new directory, and then having a clear "doctest vs test" split. But that would be another MR.

Will MWPS but further edits here are certainly welcome!