Problems with non-ASCII Unicode and binary stdin/stdout/stderror in Python 3

jehturner commented 2 years ago

Update: this was the first of several related problems discussed here; see below.

It turns out that at least one of our CL scripts has a special character (non-breaking space) in a comment for some reason, which PyRAF happily treats as part of the comment under Python 2, while Python 3 gives the following error when trying to read the script (before PyRAF even gets as far as parsing it):

UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 7931: invalid start byte

This can be reproduced trivially with a 2-line Python script:

fh = open('/home/jturner/repos/gemini/gmos/mostools/gmskcreate.cl')
text = fh.read()

I wonder whether utf-8 is actually the best assumption for CL scripts? This one is probably latin-1? Anyway, I'll remove the offending character from our script, but I just thought this might be worth mentioning in case other people report seeing the above error (especially when just running lpar) and I thought I'd put it in an issue so they can find it in a Web search. Feel free to close this without doing anything if you don't think there's a better default encoding specification.

olebole commented 2 years ago

Pythons open() function defaults to the platform default for encoding, not strictly to UTF-8. I a bit hesitate to overwrite this with latin-1, since any new file would probably written with the platform default, whatever this is. However, IRAF itself is not8 bit clean. 2.16.1 and older may just crash with non-ASCII chars, newer silently ignore them:

ecl> imcopy dev$pix bla1ä2ö3ü4.fits
dev$pix -> bla1ä2ö3ü4.fits
ecl> ls
bla1234.fits

I would therefore propose to use the "ascii" encoding, and ignore any encoding error:

>>> 'bla1ä2ö3ü4'.encode('latin1').decode('ascii', errors='ignore')
'bla1234'
>>> 'bla1ä2ö3ü4'.encode('utf-8').decode('ascii', errors='ignore')
'bla1234'

As far as I checked, this is is required in cl2py.py https://github.com/iraf-community/pyraf/blob/7c63d6a2a569afdeefee1622ba4c24ef97daaea4/pyraf/cl2py.py#L93-L95 and when reading the file for possible tracebacks in irafecl.py https://github.com/iraf-community/pyraf/blob/7c63d6a2a569afdeefee1622ba4c24ef97daaea4/pyraf/irafecl.py#L388-L391 https://github.com/iraf-community/pyraf/blob/7c63d6a2a569afdeefee1622ba4c24ef97daaea4/pyraf/irafecl.py#L403-L404

I am not sure about the other places where a non-binary open() happens:

graphcap.py:        lines = open(self.filename).readlines()
irafpar.py:                    self.fh = open(iraf.Expand(self.value))
irafpar.py:            supfile = open(supfname)
irafpar.py:    fh = open(os.path.expanduser(filename))
iraffunctions.py:    lines = open(file).readlines()
filecache.py:            fh = open(filename)

On most, it would probably not hurt to also use the encoding='ascii', errors='ignore' arguments, right?

jehturner commented 2 years ago

Sorry, this is happening at the read() in MD5Cache in filecache.py. ~Let me just check where the file gets opened, but it's probably where you wrote above.~

I haven't thought about these issues at much length, but yes, ascii with no errors makes a lot of sense off hand (and should make the code work with the current public version of Gemini IRAF, which is good).

jehturner commented 2 years ago

Sorry, the relevant open() is the one in FileCache._getFileHandle(), which is called on the very same line as the read() in MD5Cache (though that method can also take an existing file handle). I suppose this is just checking whether the CL cache is up to date.

When that one is fixed by adding encoding='ascii', errors='ignore', then it goes on to fail at the next open(), in cl2py.py. When the one in cl2py is fixed as well, the lpar succeeds (and it looks like the CL script would work if I had the right inputs lying around; will verify that in due course)...

jehturner commented 2 years ago

The problem is that FileCache also appears to be used for other things than CL scripts, such as the GraphCap example you found, so we'll need to check whether encoding='ascii' is always appropriate, though it's hard to see how it could be worse than not specifying any encoding.

olebole commented 2 years ago

IMO GraphCap is not a big problem as this is almost never touched, and it is already ASCII. I also would not assume that anyone used GraphCap with non-ASCII beside of comments, and I am not sure what IRAF itself would do here. BTW, this would be my proposed test code:

@pytest.mark.parametrize('ecl_flag', [False, True])
@pytest.mark.parametrize('encoding', ['utf-8', 'iso-8859-1'])
@pytest.mark.parametrize('code,expected', [
    ('print AA #  Ångström\n', 'AA\n'),
    ])
def test_clfile(tmpdir, ecl_flag, encoding, code, expected):
    # Check proper reading of CL files with different encodings
    with use_ecl(ecl_flag):
        fname = tmpdir / 'cltestfile.cl'
        stdout = io.StringIO()
        with fname.open("w", encoding=encoding) as fp:
            fp.write(code)
        iraf.task(xyz=str(fname))
        iraf.xyz(StdoutAppend=stdout)
        assert stdout.getvalue() == expected

Here, we actually need a tmpdir parameter, since we want to write a temporary CL file.

When using "ascii" for the encoding in open(), there is however one difference to IRAF (and earlied PyRAF). The CL file

print Ångström

now prints an ngstrm, because the non-ascii chars are ignored. IRAF and old PyRAF print it verbatim (i.e. correctly if the encoding of the file matches the terminal encoding, and with garbage if not). I don't see a good way around this.

jehturner commented 2 years ago

That sounds good and I'll take the opportunity to re-test the changes, which should give us a good idea whether there are unexpected consequences. I was just scratching my head over a few things though...

I agree about GraphCap (probably ParCache too), but there is also irafexecute._ProcessProxy(), which apparently uses FileCache to deal with binaries. I think that should be OK, because _ProcessProxy doesn't inherit its parent's updateValue, so never calls _getFileHandle -- but it seems to indicate that FileCache isn't necessarily just for text files (in theory), in spite of the code in the parent class.

Also, I'm wondering about irafpar.IrafParL, which seems to be the implementation of "list structured parameters" such as cl.list or *struct parameters in CL scripts (including a subclass, IrafParCursor, for cursor data). These constructs are used to read lines from an input file, which I suppose does imply text, but is it safe to assume plain ASCII (as in L309 of your PR) when dealing with fairly arbitrary data rather than code??

In addition to your PR changes, is it OK that iraffunctions.redirProcess() opens files for I/O redirection as text with an unspecified encoding (L3756)? I suppose you won't want to skip characters in such a case and consistently assuming the platform default encoding may be the best you can do (thinking out loud).

We do have a script author with an "ø" in her name... but it would probably only ever appear in comments, if at all (I think she uses Latin "o" in this context)!

olebole commented 2 years ago

With respect to a use of FileCache to retrieve non-text files, you are right. However, I think this can be greatly simplified, which then avoids the problem: FileCache comes with a "base" implementation that caches the file content. This is however always overwritten in the child classes, and FileCache is never used directly; we can just remove this "base" implementation. MD5Cache uses in its implementation the reading of the base class the _getFileHandle() method of the base class; this is the only place where this method is used. So, we could move it to MD5Cache, but we could also just pragmatically replace it with a simple binary open() function (and properly handle it in a with clause). I added this to the PR as commit a3317991931d5cb9a4df4ed617c4c1157e43eabb; maybe you can have a look?

For IrafParL (and maybe even the others) -- maybe I should revert my thoughts partially and just as an errors="ignore" everywhere? Then, we handle non-ASCII at least a bit, f.e. they would work properly if a file was written in and then read in in the platform's default encoding. The difference to Python2 and IRAF CL would be that instead of showing garbage for encoding errors, we would just show nothing.

In iraffunctions.redirProcess() I should also and an errors="ignore", you are right.

jehturner commented 2 years ago

That tidying up of FileCache looks reasonable to me, though _getAttributes seems to contemplate a case where self.filename is an existing file handle and I'm not sure whether that would actually work in updateValue; for example, I see something like this even if the access modes are made consistent:

--> open(open("teststr.py"), mode="rb")
Traceback (innermost last):
  File "<console>", line 1, in <module>
TypeError: expected str, bytes or os.PathLike object, not _io.TextIOWrapper

The way we tend to use list-structured parameters is for reading lines from a temporary file that has just been generated within the same CL script (or sometimes from a look-up table distributed with our package), so I hope that will limit the scope for inconsistencies, but I suppose it will depend on IRAF and not just Python. Those lines are likely to be plain text in practice (eg. filenames or co-ordinates), but that's probably not guaranteed. Another common use would be for reading a list of input filenames provided by the user.

I'm having trouble thinking what's the best way to deal with file redirection and list files when IRAF's assumptions differ from Python's. I suppose you don't know whether you're getting UTF-8 or ISO-8859 or even some encoding for a device from 1970?? It's still not entirely clear to me how everything fits together or where it's likely to fail in practice, so I'm a bit worried about breaking things, but I suppose that skipping characters can't be worse than getting an exception in most cases (as long as it's not silently corrupting something critical).

jehturner commented 2 years ago

The next failure that I started looking into produces a TypeError when piping input into a foreign task under Python 3, so that might be instructive...

olebole commented 2 years ago

For the _getAttributes, I checked that this was always used with a string, never with an opened file. The portion that assumed an opened file was there from the beginning (~2003, ea2716c), and there it was just copied from clcache.py resp. earlier from cl2py.py, where it may have had a use. To me, the other places look pretty clear: in graphcap.py it opens dev$graphcap, in irafpar.py it opens a parameter file. For both it should be safe to assume text -- IRAF itself handles text only as ASCII. in one in iraffunctions.py opens the startup file. The place you probably mean https://github.com/iraf-community/pyraf/blob/d3e9b876dcf94bde05b590d1a5935abda4919382/pyraf/iraffunctions.py#L3708-L3716 https://github.com/iraf-community/pyraf/blob/d3e9b876dcf94bde05b590d1a5935abda4919382/pyraf/iraffunctions.py#L3756 goes back to 1999 (aa363142). While this may in theory somehow dangerous: a IRAF task could write a binary to stdout, changing this is probably quite complicated (one would need to check all places where stdin/stdout is used), and IRAF is primarily meant as interactive tool where stdout is human readable (--> ASCII). That's why I would rather like to keep this unless we really run into problems here. Foreign tasks may show this however, since they are usually not thought to be used within IRAF.

Do you think the TypeError is connected to the encoding stuff? Or is this extra?

jehturner commented 2 years ago

I suspect that the "TypeError: a bytes-like object is required, not 'str'" is related to this (and maybe also write() argument must be str, not bytes elsewhere), but I haven't finished tracking it down yet.

olebole commented 2 years ago

I digged a bit into the problem with stdin/stdout: A major problem is that Python assumes them to be opened in text mode, i.e. when read they return a string. So, it does not allow to pipe a binary file, as seen in the following demonstration:

$ cat stdin100.py 
import sys
s = sys.stdin.read(100)
print(s)
$ python3 stdin100.py < /dev/urandom 
Traceback (most recent call last):
  File "/home/oles/Projects/2015/upstream/iraf/pyraf/stdin100.py", line 2, in <module>
    s = sys.stdin.read(100)
  File "/usr/lib/python3.9/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xf9 in position 2: invalid start byte

I don't see a good way out of this: If we open our redirected stdin/stdout as binary, the file mode would depend on whether it was redirected or not. If we globally set this, it may confuse people who use PyRAF as a library in a bigger program (and therefore may rely on an standard stdin/stdout). So, if this causes real problems, I would propose to make a separate issue out of it, since this would probably require some larger changes (we must identify all relevant read/write places and replace them with stdin.buffer.read(…) and similar).

The TypeError would be however probably useful to handle here. That is probably a missing encode()/decode().

jehturner commented 2 years ago

Sorry, had a bit of an onslaught of odds & ends earlier today. Yes, that's the kind of thing I was worried about, though I was thinking IRAF may be more likely to produce "binary" (really some sort of obscure new-line-separated encoding) via a list-structured parameter rather than stdout? My impression is that it's considered good practice to pipe strict ASCII through stdout, unless it's intrinsically a binary stream being manipulated. The great majority of IRAF tasks send their main output to a named file rather than the terminal. There's the issue of IRAF spewing out terminal control codes, but I think PyRAF mostly avoids that by sending plotting data to the Tk window instead? In the other direction, one thing we have run into is UTF-8 special characters in a log file, eg. from the date command in a non-English locale, which IRAF SPP functions then choke on. Anyway, I'm churning out semi-informed speculation, but given the test failure modes that I've seen, I'm thinking that whatever PyRAF currently does is not too broken in practice... I agree about opening a separate issue for anything non-trivial, rather than making rushed changes. I'll finish checking on the TypeError.

olebole commented 2 years ago

I now added an attempt to resolve the passing of binary files through pipes, 59e16ea44e7ab165f867f1058f7edab929a816df. This handles the following cases (on PyRAF command line)

--> head  /dev/urandom 
���(!~y�
--> head < /dev/urandom 
���*Wg��S�L�c����}
--> head /dev/urandom > some_randoms.dat

Without this commit, these three cases return an

UnicodeDecodeError: 'ascii' codec can't decode byte 0x91 in position 1: ordinal not in range(128).

I am a bit worrying that this may destroy something since it is deep in the IrafProcess class; the tests pass however. Maybe this is worth to try with your cases?

Piping however doesn't work then properly, since this is done via StringIO and not via TextIOWrapper (so it does not have a buffer to write plain bytes). https://github.com/iraf-community/pyraf/blob/d3e9b876dcf94bde05b590d1a5935abda4919382/pyraf/iraffunctions.py#L3769-L3776 https://github.com/iraf-community/pyraf/blob/59e16ea44e7ab165f867f1058f7edab929a816df/pyraf/irafexecute.py#L821-L824

--> head /dev/urandom | head
Traceback (innermost last):
  File "<CL script CL1>", line 1, in <module>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc9 in position 0: invalid continuation byte

Attempting to fix this by replacing _io.StringIO() with _io.TextIOWrapper(_io.BytesIO()) in iraffunctions.py does not help, since the content is really handled as tring in redirReset() and then returned to a number of places (which expect a string and not bytes).

jehturner commented 2 years ago

Argh, of course, they're actually short ints in IRAF, just to make things even more fun. It's amazing that all this mostly works...

Sorry for the delay with the TypeError: a bytes-like object is required, not 'str', which is actually trivial to reproduce with the following line in CL mode:

print("test\n") | cat

Note that the error does not occur if you replace the foreign cat with the native type. This seems like a regression. It happens here: https://github.com/iraf-community/pyraf/blob/d3e9b876dcf94bde05b590d1a5935abda4919382/pyraf/subproc.py#L862 That comment says that it converts str to bytes, but the stsci.tools function bytes_write() that did so was replaced with os.write() in https://github.com/iraf-community/pyraf/commit/ffbbde9a7a866430462f0f3b7b2bc498bd9b7f8b: https://github.com/iraf-community/pyraf/blob/c02ce8988f1a165c2ba348d6c388b0ddd6956f94/pyraf/subproc.py#L248 https://github.com/iraf-community/pyraf/blob/d3e9b876dcf94bde05b590d1a5935abda4919382/pyraf/subproc.py#L243

If I restore a conversion to bytes before the os.write() like this, the problem goes away and the above CL command works with both cat & type:

    if not isinstance(strval, bytes):
        strval = strval.encode('ascii')
    if os.write(self.toChild, strval) != len(strval):

Does that seem like the right solution? Was it just a misassumption during tidying up? I wonder whether anything else was broken by removing those conversions?

olebole commented 2 years ago

You are right; this was lost during the transition. I probably didn't carefully check that bytes_write() accepts (and converts) str as well as bytes. So, I just included your change. There was only one other place which used bytes_write() https://github.com/iraf-community/pyraf/blob/91e90b604edaad20142e86ef573491c41595a999/lib/pyraf/irafdisplay.py#L201-L214 Here is was also replaces with os.write(); however the method _write() is only called from _writeHeader() where the call is properly adjusted. The other changes in ffbbde9a7a866430462f0f3b7b2bc498bd9b7f8b (ndarr2bytes(), ndarr2str()) seem correct to me.

I now also include a few tests for redirection with both IRAF and foreign tasks, and corrected a few more places. Now this seems to work :-) (except piping of binary files between two tasks, see above)

jehturner commented 2 years ago

Great. I'm just looking at one other TypeError that's not quite the same thing but similar (update: got a bit waylaid again, but I think your PR already solves it).

jehturner commented 2 years ago

The other case that was failing with TypeError: write() argument must be str, not bytes was, in fact, writing binary to stdout. It reduces to this (which does require a binary input file):

gkiextract ("test.gki", "1-99", yes, verify-, default+, > "tmpfile")

The redirect is irrelevant here, but of course without it a lot of crap appears in the terminal if the command succeeds.

This was failing at IrafProcess.xmit() in the chan == 4 block. The self.stdout.write(txdata) at the end of that section was producing the TypeError because txdata comes from self.read(), which is documented to produce bytes (without the conditional IrafAsc2String being executed in between).

I was wondering what a .tb flag is, but I see that's not the problem here; this case appears to work with your PR, after adding the conditional self.stdout.buffer.write(txdata), which I suppose is just bypassing the text encoding? That makes sense.

jehturner commented 2 years ago

As for piping a binary stream, I believe that could be an issue here, eg. the help page for gkiextract shows this example usage:

Extract frames 1, 3 and 5 from metacode file "mc_file" and plot them on the device "vup":

cl> gkiextract mc_file 1,3,5 | stdplot dev=vup

However, I noticed that the CL script I was troubleshooting contains the following comment immediately before the gkiextract line, suggesting that this usage didn't work with PyRAF historically either:

    # Can't do this in PyRAF
    #gkiextract (plot//".gki", "1-99", yes, verify-, default+) |\
    #    gkimosaic ("STDIN", device="psfl", output="", nx=2, ny=2, rotate-,
    #    fill-, inter+, cursor="")

On the other hand, if I try to pipe the gkiextract command in my previous comment to stdplot in PyRAF 2.1.15 on Python 2, I do not get a pipe error (whether I can get it to produce a sensible plot is another matter), so maybe binary pipes were working for a while under Python 2. I doubt they are widely used in scripts though, if it didn't work at the time we ported our scripts from cl.

olebole commented 2 years ago

Maybe you can attach the test.gki? With the file dev$vdm.gki provided by IRAF, this seems to work fine:

--> gkiextract dev$vdm.gki 1-99 yes verify- default+ > tmpfile
Metacode file 'dev$vdm.gki' contains 8 frames

jehturner commented 2 years ago

Yes, I was just looking at producing a gki file in the test (I'd forgotten that there's one in dev$) and was seeing a bit of odd behaviour. Your command above does fail for me though ... are you sure you don't have your fix in place?

--> gkiextract dev$vdm.gki 1-99 yes verify- default+ > tmpfile
Traceback (innermost last):
  File "<CL script CL1>", line 1, in <module>
TypeError: write() argument must be str, not bytes

Presumably the 2nd arg should not be more than 1-8.

olebole commented 2 years ago

I actually do have my fixes in the place. I thought that was an issue that was not resolved with the current PR?

jehturner commented 2 years ago

No, I was saying that your PR fixed this additional failure from my previous test run. However, we should add the test for it (let me know if you'd like me to do that).

olebole commented 2 years ago

Ah, OK, this was a misunderstanding. I will add this to the tests.

Do you have more things that may not work yet wrt. i/o and encoding?

jehturner commented 2 years ago

I believe that's all the encoding-related issues that I currently have, until I re-run all the tests with the latest main branch (or if you want we could test the PR before merging it, as I think your branch is up-to-date with main, isn't it?).

olebole commented 2 years ago

I rebased #119 to main, so you can just test this branch. If all looks fine, I would like to merge it. If you like, you can still add a test with gkiextract; otherwise I would do it.

jehturner commented 2 years ago

I'm about half way through the distinct test failure modes that I identified last year, but some of the remaining cases are just things like small numerical differences. I might have to take a break after this one, to concentrate on something else (I had tentatively hoped we might already be able to use Pyraf on Python 3 for that, but it's not critical and it can't leave said job much longer because of other stuff coming up in March).

olebole commented 2 years ago

Great! So just drop me a note when you think this can be merged!

jehturner commented 2 years ago

So I added a test (locally), but in the process I found an unintended consequence of feeding pytest's tmpdir fixture to Stdout for gkiextract. Here's some pytest output showing a simplified example:

tmpdir = local('/tmp/pytest-of-jturner/pytest-62/test_oops0')

    def test_oops(tmpdir):
        outfile = tmpdir / 'testout.gki'
        iraf.gkiextract('dev$vdm.gki', '2-7', iraf.yes, verify=False,
>                       Stdout=outfile)

pyraf/tests/test_cli.py:333: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
pyraf/iraftask.py:836: in __call__
    return self.run(*args, **kw)
pyraf/iraftask.py:375: in run
    self._run(redirKW, specialKW)
pyraf/iraftask.py:880: in _run
    irafexecute.IrafExecute(self, iraf.getVarDict(), **redirKW)
pyraf/irafexecute.py:348: in IrafExecute
    irafprocess.run(task, pstdin=stdin, pstdout=stdout, pstderr=stderr)
pyraf/irafexecute.py:512: in run
    self.slave()
pyraf/irafexecute.py:645: in slave
    xmit()
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <pyraf.irafexecute.IrafProcess object at 0x7fea28625f90>

    def xmit(self):
        """Handle xmit data transmissions"""

        chan, nbytes = self.chanbytes()

        checkForEscapeSeq = (chan == 4 and (nbytes == 6 or nbytes == 5))
        xdata = self.read()

        if len(xdata) != 2 * nbytes:
            raise IrafProcessError("Error, wrong number of bytes read\n"
                                   f"(got {len(xdata):d}, "
                                   f"expected {2 * nbytes:d}, chan {chan:d})")
        if chan == 4:
            if self.task.getTbflag():
                # for tasks with .tb flag, stdout is binary data
                txdata = xdata
            else:
                # normally stdout is translated text data
                txdata = Iraf2Bytes(xdata)

            if checkForEscapeSeq:
                if (txdata[0:5] == b"\033+rAw"):
                    # Turn on RAW mode for STDIN
                    self.stdinIsraw = True
                    return

                if (txdata[0:5] == b"\033-rAw"):
                    # Turn off RAW mode for STDIN
                    self.stdinIsraw = False
                    return

                if (txdata[0:5] == b"\033=rDw"):
                    # ignore IRAF io escape sequences for now
                    # This mode enables screen redraw code
                    return

            if hasattr(self.stdout, "buffer"):
                self.stdout.buffer.write(txdata)
            else:
>               self.stdout.write(txdata.decode())
E               UnicodeDecodeError: 'utf-8' codec can't decode byte 0xff in position 0: invalid start byte

pyraf/irafexecute.py:824: UnicodeDecodeError

Where outfile is a file-like py._path.local.LocalPath object that has no buffer. This can be fixed by doing outfile = str(tmpdir / 'testout.gki') instead in the test, but I thought some extra check might be warranted?

olebole commented 2 years ago

I think the error is a bit misleading, as PyRAF does not handle pathlib.Path yet: If Stdout is a str, it will be opened as a file (text mode in case of stdout, which comes with a buffer attribute), while a pathlib.Path is not recognized and taken as-is https://github.com/iraf-community/pyraf/blob/b3c4de3277c39cc745cf49eeeeee677d2ea58162/pyraf/iraffunctions.py#L3725-L3728 (and following lines). Therefore, in your snipped, self.stdout is still a pathlib.Path, and so the decode() is tried (and fails).

The Stdout argument should be either a str, or f.e. a io.TextIOWrapper(io.BytesIO()) (which what I was using).

Supporting pathlib.Path (resp. os.PathLike) is a different issue, since one has to systematically check on which places we imply that a str being a path.

jehturner commented 2 years ago

Hmm, these changes seem to have tripled the number of tests failing :slightly_frowning_face: -- but it's probably just one or two things that still aren't quite right with the encoding & decoding.

All the spot checks I've done show TypeError: a bytes-like object is required, not 'str'. You can actually reproduce this just by starting pyraf and typing imaccess('blah.fits'), where the input file doesn't exist. It seems to be failing in xmit() again, this time in the chan == 5 block, for stderr (which appears to be a StringIO object):

https://github.com/iraf-community/pyraf/blob/a98f50a3d718a23ccbb62b28f1f478c3965cb1cc/pyraf/irafexecute.py#L831

Should this line actually read self.stderr.write(Iraf2Bytes(xdata).decode())? :grin: I'll try re-starting the tests with that change.

olebole commented 2 years ago

You are ofcourse right! That is a typo -- xdata is in IRAF style ("16-bit-ASCII"), so it first needs to be converted to bytes and only then to Unicode. I fixed and force-pushed this, and now your example succeeds. Reminder to myself: we need some tests that check the handling of stderr, since these are separate paths in the code)

jehturner commented 2 years ago

This run is looking more encouraging so far! The only new thing I've spotted is a text encoding problem in our regression testing scripts :see_no_evil:. It takes a long time to run everything though and I'll need to do some twice since I forgot something...

jehturner commented 2 years ago

I think this is looking good. After fixing several minor problems that showed up with our tests on Python 3, I've only found 1 new failure that is due to the changes in your branch. In that case the order of some lines printed to stdout changes when redirecting stdout to a file (previously there was a line of text with a newline before and after it, whereas now that line of text comes first, followed by 2 newlines). However, I think this is just a race condition that happens in a different order now that something has changed. The newlines are generated by CL's print(""), while the line of text comes from Gemini IRAF's SPP task glogprint. I haven't really dug into the race condition, but I've seen it vary without switching branch under some circumstances, so I think our script was wrong to rely on the ordering remaining as written in the first place.

It looks like all the problems we've worked on (Not a legal pipe record, Error in parameter, CL syntax error at '=' and various text encoding issues herein, as well as an Out of memory, associated with the others) have gone away. Quite a large proportion of the affected tests are now failing in other ways that were either known failure modes that I still have to look into or problems with our testing framework on Python 3 that hadn't shown up previously. There are now 189 tests failing (out of 932), rather than 242 last year. That's actually better than it looks, because of all the cases that now run correctly but have (eg.) small rounding differences compared with the reference results. I suspect that might actually be due to the newer numpy on Python 3, rather than PyRAF itself (which would knock out a lot of failures), but that's TBC.

I haven't combed through all the failures exhaustively again, but I've looked at enough that I think it should be fine to merge this. I see you've pushed some more changes since the version I tested, but it looks like you've just been tidying up the commits?

jehturner commented 2 years ago

Should we rename this issue to something a bit more general for future reference?

olebole commented 2 years ago

That all sounds great to me! I happily merge the PR tomorrow. I changed the issue title to something more descriptive. Do you think it is worth to make a new release then? It is not really urgent from my side, but especially this fix is quite worthwhile to distribute.

Thank you so much for your extensive tests, for digging into the issues, and for the patches!

jehturner commented 2 years ago

We don't specifically need a release yet, but it might be a good idea, since several problems have been cleared up and it could be a while before we resolve our final test failures. I believe all our tests now run and we're down to smallish differences in output, rather than things crashing, which seems like a significant step forward.

On the other hand, after some fiddling with installations I was able to re-run the tests with the same numpy 1.16.5 and astropy 2.0.9 used in Python 2 and neither made any difference to the failures -- so it appears that the many numerical differences are due either to Python 3 or changes inpyraf itself (or stsci.tools). Obviously a few mathematical things did change in v3, like division, but I'm not sure off the top of my head how that would translate into extensive rounding-type differences in FITS files. Actually, now that I'm reminding myself what did change, the behaviour of round() seems like a good candidate ... I should check on that.

iraf-community / pyraf

Problems with non-ASCII Unicode and binary stdin/stdout/stderror in Python 3 #117