zopencommunity / perlport

Perl programming language
Apache License 2.0

Reading an untagged file <= 8 bytes in size causes output encoding differences #84

Open chrishodgins opened 7 months ago

chrishodgins commented 7 months ago

With the following Perl program, the output appears corrupted unless the file is larger than 8 bytes. The file untagged-file-with-ebcdic.txt is untagged and contains only EBCDIC characters.

Perl test program:

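# Read the file line by line and echo each line back unchanged.
open(my $fh, '<', 'untagged-file-with-ebcdic.txt')
    or die "Cannot open untagged-file-with-ebcdic.txt: $!";
while (my $row = <$fh>) {
    chomp $row;
    print "$row\n";
}
close($fh);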
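
For anyone trying to reproduce this, the two input variants shown in the od dumps in this report can be recreated byte for byte. The following is only a sketch: the file names ebcdic-8-bytes.txt and ebcdic-9-bytes.txt are hypothetical, and it assumes a POSIX printf that writes octal escapes as raw bytes and that output redirection does not auto-tag new files on your system.

```shell
#!/bin/sh
# Hypothetical file names; the byte values come from the od dumps in the report.
# EBCDIC digits '1'..'8' are 0xF1..0xF8 (octal 361..370); EBCDIC newline is 0x15 (octal 025).

# 7 digits + newline = 8 bytes (the size at which the output was corrupted)
printf '\361\362\363\364\365\366\367\025' > ebcdic-8-bytes.txt

# 8 digits + newline = 9 bytes (the size at which the output was correct)
printf '\361\362\363\364\365\366\367\370\025' > ebcdic-9-bytes.txt

# On z/OS, remove any file tag so both files are untagged, as in the report.
# chtag is skipped on systems that do not provide it.
if command -v chtag >/dev/null 2>&1; then
    for f in ebcdic-8-bytes.txt ebcdic-9-bytes.txt; do
        chtag -r "$f"
    done
fi

# Show the raw bytes for comparison with the dumps in the report.
od -An -tx1 ebcdic-8-bytes.txt
od -An -tx1 ebcdic-9-bytes.txt
```

To use these with the test program, copy one of them over untagged-file-with-ebcdic.txt before running perl test.pl.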

Shell example:

$ chtag -r untagged-file-with-ebcdic.txt
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F715
               1   2   3   4   5   6   7  \n
0000000008
$ perl test.pl
�������

### Now try again with slightly bigger contents
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F7F8    1500
               1   2   3   4   5   6   7   8  \n
0000000009
$ perl test.pl 
12345678

Repeating the same sequence with the file tagged as IBM-1047:

$ chtag -tc IBM-1047 untagged-file-with-ebcdic.txt
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F715
               1   2   3   4   5   6   7  \n
0000000008
$ perl test.pl
1234567

### Now try again with slightly bigger contents
$ od -Ax -xc untagged-file-with-ebcdic.txt
0000000000      F1F2    F3F4    F5F6    F7F8    1500
               1   2   3   4   5   6   7   8  \n
0000000009
$ perl test.pl 
12345678

IgorTodorovskiIBM commented 7 months ago

Thanks, this is similar to the issue raised for nano: https://github.com/ZOSOpenTools/nanoport/issues/11

Most zopen tools leverage zoslib, which is the culprit here. You can try using the environment variable mentioned in the nano issue until we have a proper fix.

chrishodgins commented 7 months ago

@IgorTodorovskiIBM thanks, I can confirm that running export __UNTAGGED_READ_MODE=ASCII resolves the issue for now.
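
If exporting the variable for the whole shell session is too broad, the workaround can probably be scoped to a single command. This is a sketch only: the helper name with_untagged_ascii is hypothetical, and it assumes zoslib picks the variable up from the process environment at startup, as the export-based workaround suggests.

```shell
#!/bin/sh
# Hypothetical helper: run one command with __UNTAGGED_READ_MODE=ASCII
# set only in that command's environment, leaving the calling shell untouched.
with_untagged_ascii() {
    env __UNTAGGED_READ_MODE=ASCII "$@"
}

# Demonstration that the variable reaches the child process.
with_untagged_ascii sh -c 'echo "mode in child: ${__UNTAGGED_READ_MODE}"'
```

With this in place, the test from this issue would be run as with_untagged_ascii perl test.pl.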