IAU-ADES / ADES-Master

ADES implementation based on a master XML file

Converting program codes between obs80 and ADES #53

Closed stevechesley closed 6 months ago

stevechesley commented 6 months ago

After testing mpc80coltoxml.py and xmltompc80col.py it seems that the program code mapping was not working correctly.

My understanding is that the program code given in column 14 of obs80 should be converted to an integer according to the order given at top here. For ADES prog, this integer is then written as a 2-character base-62 number, where the base-62 counting order is [0...9A...Za...z]. So, for example, 0 -> 0 -> 00 and z -> 93 -> 1V. This gives the obs80-to-ADES translation for the 94 available obs80 program codes.
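A minimal sketch of the forward mapping described above; it is not the code in packUtil.py. The ordering string `OBS80_PROGRAM_CODES` is an assumption, chosen because it reproduces the examples quoted in this thread (0 -> 0, I -> 50, z -> 93); the authoritative ordering is the MPC's list. The function name is hypothetical.

```python
import string

# Base-62 digits in the counting order stated above: 0-9, A-Z, a-z.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase

# ASSUMED ordering of the 94 obs80 program codes (index = position).
# It matches the examples in this thread but should be checked against
# the MPC's published list before being relied on.
OBS80_PROGRAM_CODES = (
    string.digits
    + "!\"#$%&'()*+,-./"
    + "[\\]^_`{|}~"
    + ":;<=>?@"
    + string.ascii_uppercase
    + string.ascii_lowercase
)


def obs80_code_to_prog(code: str) -> str:
    """Map a 1-char obs80 program code to a 2-char base-62 ADES prog."""
    if len(code) != 1 or code not in OBS80_PROGRAM_CODES:
        raise ValueError(f"not an obs80 program code: {code!r}")
    index = OBS80_PROGRAM_CODES.index(code)
    first, second = divmod(index, 62)  # two base-62 digits
    return BASE62[first] + BASE62[second]


print(obs80_code_to_prog("0"))  # 00
print(obs80_code_to_prog("z"))  # 1V
```

Since 94 < 2 * 62, the first digit of any prog produced this way is always 0 or 1.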

@federicaspoto I have added a convenient (at least for me) doc/progDecoder.txt. Can you please verify that what I have there is correct?

Going the other way, for prog values less than or equal to 1V = 93 there is a clear translation to the 1-char program code in obs80. For values of 1W = 94 and larger, the program code is dropped and a blank is placed in column 14.
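The reverse direction can be sketched the same way. As before, this is an illustration rather than the packUtil.py routine: the function name is hypothetical and the ordering string is an assumption that agrees with the examples in this thread but should be verified against the MPC's list.

```python
import string

# Base-62 digits in counting order: 0-9, A-Z, a-z.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase

# ASSUMED ordering of the 94 obs80 program codes (index = position).
OBS80_PROGRAM_CODES = (
    string.digits
    + "!\"#$%&'()*+,-./"
    + "[\\]^_`{|}~"
    + ":;<=>?@"
    + string.ascii_uppercase
    + string.ascii_lowercase
)


def prog_to_obs80_code(prog: str) -> str:
    """Map a 2-char ADES prog to the obs80 column-14 character, or a blank."""
    if len(prog) != 2:
        raise ValueError(f"prog must be 2 characters: {prog!r}")
    # str.index raises ValueError for characters outside the base-62 alphabet.
    index = BASE62.index(prog[0]) * 62 + BASE62.index(prog[1])
    if index < len(OBS80_PROGRAM_CODES):  # 00..1V, i.e. 0..93
        return OBS80_PROGRAM_CODES[index]
    return " "  # 1W (94) and above have no obs80 equivalent


print(repr(prog_to_obs80_code("1V")))  # 'z'
print(repr(prog_to_obs80_code("1W")))  # ' '
```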

I believe I have fixed (and tested) the relevant routines in packUtil.py, but review is needed.

A related problem is that obs80 column 14 can be either a note or a program code, so it is a bit sticky to decide how to treat it. See the comment I have added to mpc80coltoxml.py. Right now [A..Za..z] are always treated as notes, but this is clearly wrong since the MPC reports many alphabetical program codes. Options are to refer to a static copy of that file when converting obs80 to ADES, or to do something more naive. I will open a separate issue for this...
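The "static copy" option could look roughly like the sketch below. The table name, its contents, and the function are all hypothetical; the single H06/I entry is only illustrative and is taken from the example later in this thread.

```python
# Hypothetical static snapshot: station code -> set of assigned program codes.
# Real contents would come from a saved copy of the MPC's program-code list.
STATION_PROGRAM_CODES = {
    "H06": {"I"},  # illustrative entry only; the real list may be longer
}


def col14_is_program_code(col14: str, station: str) -> bool:
    """Return True if col14 is a known program code for this station."""
    return col14 in STATION_PROGRAM_CODES.get(station, set())


print(col14_is_program_code("I", "H06"))  # True
print(col14_is_program_code("K", "H06"))  # False
```

A lookup like this only says what a character means under the current assignments; it cannot tell whether an older record used the same character as a note.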

Bill-Gray commented 6 months ago

Interesting. I've assumed that if column 14 of a punched-card record corresponds both to a program code for that observatory and to a note, you couldn't tell which it was. Now I have two questions, though:

(1) If the two-digit base-62 <prog></prog> can run from 00 to 1z, we can have 124 program codes for a given observatory code. If, as Federica suggests, it can run from 00 to 2z, we can have 186 program codes. I kinda thought it could run up to zz, for 62^2 = 3844 program codes. Is there an upper limit? (If so, I'd think we should use that, rather than set an upper limit on 'first'.)

(2) Are there cases where an obscode had no program codes, and therefore used alphabetical notes freely, and then did have program codes, some of which might be alphabetical?

federicaspoto commented 6 months ago

> (1) If the two-digit base-62 <prog></prog> can run from 00 to 1z, we can have 124 program codes for a given observatory code. If, as Federica suggests, it can run from 00 to 2z, we can have 186 program codes. I kinda thought it could run up to zz, for 62^2 = 3844 program codes. Is there an upper limit? (If so, I'd think we should use that, rather than set an upper limit on 'first'.)

@Bill-Gray I'm sorry, I believe that my comment was not clear. There is no upper limit. I was just giving an example showing that the code was not able to handle '2', but in general the code won't be able to handle anything that doesn't have '1' or '0' as its first character.

federicaspoto commented 6 months ago

> (2) Are there cases where an obscode had no program codes, and therefore used alphabetical notes freely, and then did have program codes, some of which might be alphabetical?

@Bill-Gray I am not sure I fully understood your question. Do you mean a case like the following one:

 stn | notes | prog |                                      obs80                                       | status 
-----+-------+------+----------------------------------------------------------------------------------+--------
 H06 | K     | 0o   | B8254        IC2014 11 27.16433 02 58 53.70 +35 11 11.0          18.9 Vq~1DbGH06 | P
 H06 | K     | 0o   | B8254        IC2014 11 27.18706 02 58 52.30 +35 11 01.0          19.3 Vq~1DbGH06 | P
 H06 | K     | 0o   | B8254        IC2014 11 27.20978 02 58 50.90 +35 10 52.3          18.7 Vq~1DbGH06 | P
 H06 | K     | 0o   | B8254        IC2014 11 27.22992 02 58 49.67 +35 10 42.2          18.0 Vq~1DbGH06 | P

where I is the program code (I=0o in base62), but K is the note?

Bill-Gray commented 6 months ago

@federicaspoto - in the case you mention, let's say that (H06) originally had no program codes assigned to it. Somebody at that time might have submitted an 80-column observation with the byte in column 14 set to 'K' to indicate 'stacked image', and it would have been published that way, with that meaning.

For a later observation, published once (H06) had one or more program codes, an observation with a 'K' in column 14 would mean program code K. So column 14 would be a "note" for early observations (including published ones) and a "program code" for later ones.

I wrote all that before seeing Matt's comment and your reply. I'd not thought about the added difference between a submitted and a published observation; i.e., if the reference is blank, you can be reasonably sure that column 14 is a note. (Assuming nobody is shoving their program code in there. How confident are we about that?)
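The blank-reference heuristic can be sketched as follows. The function name is hypothetical, and the reference columns default to 73-77, which is an assumption about the layout; they are parameters so that a different range, such as 72-77, can be used instead.

```python
def classify_col14(line: str, ref_first: int = 73, ref_last: int = 77) -> str:
    """Classify obs80 column 14 as 'blank', 'note', or 'ambiguous'."""
    line = line.ljust(80)                     # tolerate right-trimmed records
    col14 = line[13]                          # column 14 (1-based)
    reference = line[ref_first - 1:ref_last]  # ASSUMED reference columns
    if col14 == " ":
        return "blank"
    if reference.strip() == "":
        return "note"      # no reference: presumably an unpublished submission
    return "ambiguous"     # published: could be a note or a program code


# The first H06 record from the table above, built piecewise so that the
# column alignment is explicit.
published = (
    "B8254" + " " * 8 + "IC"
    + "2014 11 27.16433 02 58 53.70 +35 11 11.0"
    + " " * 10 + "18.9 Vq~1DbGH06"
)
# The same record with the reference columns blanked out.
submitted = published[:72] + " " * 5 + published[77:]

print(classify_col14(published))  # ambiguous
print(classify_col14(submitted))  # note
```

The 'ambiguous' case still needs outside information, such as the station's list of assigned program codes, to be resolved.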

(This all reminds me that I've long thought it'd be nice if ITF observations did have a reference set in columns 72-77. Maybe ITnnn, where nnn = the time the observation was added to the ITF, encoded as a base-62 count of days since, say, 1900. Perhaps a general policy: anything that comes out of MPC, and presumably has had a program code inserted where a note once might have been, should have some sort of a non-blank reference.)
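To make the ITnnn idea concrete, here is a sketch of the encoding. It illustrates the suggestion only and is not an existing MPC convention; the epoch of 1900-01-01 and the function name are assumptions.

```python
import string
from datetime import date

# Base-62 digits in counting order: 0-9, A-Z, a-z.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase

EPOCH = date(1900, 1, 1)  # ASSUMED epoch for "days since, say, 1900"


def itf_reference(added: date) -> str:
    """Encode the date an observation joined the ITF as 'IT' + 3 base-62 digits."""
    days = (added - EPOCH).days
    if not 0 <= days < 62 ** 3:
        raise ValueError(f"date out of range for 3 base-62 digits: {added}")
    d1, rest = divmod(days, 62 * 62)
    d2, d3 = divmod(rest, 62)
    return "IT" + BASE62[d1] + BASE62[d2] + BASE62[d3]


print(itf_reference(date(1900, 1, 1)))  # IT000
print(itf_reference(date(1900, 1, 2)))  # IT001
```

Three base-62 digits cover 62^3 = 238328 days, or roughly 650 years from the epoch.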

stevechesley commented 6 months ago

> Do you have any tests in the new_test folder that we can run to see if the routine works?

No formal tests, but I have manually tested a range of values and verified that the packing/unpacking of ADES prog works as desired. I don't think it's worth adding unit tests for these simple functions.

Also progDecoder.txt stops at 185 = 2z just because I lost interest. It should go on to 3843 = zz, but I was making the file by hand. I will add something to show a snip in the file.

federicaspoto commented 6 months ago

> Also progDecoder.txt stops at 185 = 2z just because I lost interest. It should go on to 3843 = zz, but I was making the file by hand. I will add something to show a snip in the file.

That's fine! You don't have to add anything in there. But it wasn't clear to me how the code would work for something like '2z'. If you have the files and could add a couple of tests, that would be great.
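A couple of tests of the kind requested might look like the sketch below. `encode_prog` and `decode_prog` are hypothetical stand-ins defined here for illustration; they are not the packUtil.py routines, which would be imported instead in a real test file. The expected values are the ones quoted in this thread.

```python
import string

# Base-62 digits in counting order: 0-9, A-Z, a-z.
BASE62 = string.digits + string.ascii_uppercase + string.ascii_lowercase


def encode_prog(index: int) -> str:
    """Hypothetical stand-in: integer 0..3843 -> 2-char base-62 prog."""
    if not 0 <= index < 62 * 62:
        raise ValueError(f"index out of range: {index}")
    first, second = divmod(index, 62)
    return BASE62[first] + BASE62[second]


def decode_prog(prog: str) -> int:
    """Hypothetical stand-in: 2-char base-62 prog -> integer."""
    if len(prog) != 2:
        raise ValueError(f"prog must be 2 characters: {prog!r}")
    return BASE62.index(prog[0]) * 62 + BASE62.index(prog[1])


def test_known_values():
    assert encode_prog(0) == "00"
    assert encode_prog(93) == "1V"
    assert encode_prog(185) == "2z"
    assert encode_prog(3843) == "zz"


def test_round_trip():
    # Covers every first character, not just '0' and '1'.
    for index in range(62 * 62):
        assert decode_prog(encode_prog(index)) == index
```

pytest collects functions named `test_*` automatically, so a file like this needs no extra scaffolding.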

stevechesley commented 6 months ago

OK, I think I have it all correct here. This PR does a few things:

@federicaspoto can you make another pass? (I'm not savvy enough to make a pytest for the program code translation, but I did test it thoroughly. If you think it's important please open an issue and we'll look after it.)

stevechesley commented 6 months ago

One more thing to add to the bullet list above:

This was brought over here from PR #53 since that PR needed the packUtil.py from this branch...