Open mfsysprog opened 1 year ago
This patch changed the behaviour from interpreting untagged files as ibm-1047 to a heuristic approach that decides whether the untagged file contains ebcdic or ascii (which is fine), but it will always write out the result as ascii (which is undesirable).
This issue was opened to further look into a correct approach: https://github.com/ZOSOpenTools/meta/issues/387
There's actually a heuristic in place that detects the underlying encoding for untagged files, here: https://github.com/ibmruntimes/zoslib/blob/cc10b7c1d6211a2c28b10b540bf406c7148fbf4f/src/zos-char-util.cc#L490 . If the file is detected as EBCDIC 1047, it is auto-converted to ASCII 819 (the program CCSID). However, sometimes we don't want auto-conversion; in that case we call disableautocvt(fd) in the program (git does this in several cases). You can also control the behaviour via an environment variable, UNTAGGED_READ_MODE: https://github.com/ibmruntimes/zoslib/blob/zopen/src/zos.cc#L951 . Setting it to STRICT should disable the conversion.
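To make the idea of such a heuristic concrete, here is a minimal sketch in Python. It is not the zoslib implementation linked above; the function name `guess_encoding` and the byte-counting rule are assumptions chosen purely for illustration. It relies only on the fact that EBCDIC letters and digits sit at byte values 0x81 and above, while plain ASCII text stays within 0x20-0x7E plus a few control characters.

```python
def guess_encoding(data: bytes) -> str:
    """Guess 'ascii' or 'ebcdic' for an untagged buffer.

    Hypothetical illustration only; zoslib's real heuristic may differ.
    """
    if not data:
        return "ascii"  # nothing to inspect, fall back to the program CCSID

    ascii_whitespace = {0x09, 0x0A, 0x0D}  # tab, LF, CR
    # Bytes that look like printable ASCII text.
    ascii_like = sum(
        1 for b in data if 0x20 <= b <= 0x7E or b in ascii_whitespace
    )
    # Bytes that look like EBCDIC text: space (0x40), newline (0x15),
    # and letters/digits, which all sit at 0x81 or above.
    ebcdic_like = sum(
        1 for b in data if b == 0x40 or b == 0x15 or b >= 0x81
    )
    return "ebcdic" if ebcdic_like > ascii_like else "ascii"


if __name__ == "__main__":
    ascii_sample = b"hello world\n"
    # "hello world" plus newline, encoded by hand in EBCDIC.
    ebcdic_sample = bytes.fromhex("8885939396 40 a696999384 15")
    print(guess_encoding(ascii_sample))
    print(guess_encoding(ebcdic_sample))
```

The point relevant to this issue is that the detected encoding is only known at read time. An editor that wants to round-trip a file would have to remember that result and use it again when writing, rather than always writing in the program CCSID.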
Currently, file tags are honoured. Untagged files will always be written as ascii (except when the default for new files is changed). Still working on a fix for this. Also, the -B option to create a backup will most likely always create an ascii file (not tested yet).
When I originally ported nano, I thought from my testing that it handled untagged files and files tagged as ibm-1047 correctly. But after some testing on our systems with the current release, it seems that nano will actually always write out ascii. This has the potential to mangle untagged files or ibm-1047 tagged files.
I have to see when I can find time to look into this some more, but for now I thought I'd open a ticket to see whether somebody can replicate my findings, or whether perhaps some environment variable changed the behaviour somewhere. For now I would recommend not using nano until this is fixed.