IAU-ADES / ADES-Master

ADES implementation based on a master XML file

Valid ADES submission according to XSD but rejected when submitted #5

Closed schastel closed 2 years ago

schastel commented 4 years ago

Hi,

Large values in rms fields lead to rejection when submitted to the MPC. The MPC returns: "Error -- value too high: 28026532265984.00". I agree that the value does not make sense, but since some fields are already range-checked by the XSD (e.g. ra, dec, ...), it would make sense to validate some of the others as well.

For instance, instead of:

    <xsd:element name="rmsRA" type="PosDecimalType"/>

    <xsd:simpleType name="PosDecimalType">
      <xsd:restriction base="xsd:decimal">
        <xsd:minExclusive value="0.0"/>
      </xsd:restriction>
    </xsd:simpleType>

it would make sense to have (untested, so likely incorrect):

    <xsd:element name="rmsRA" type="RmsRaType"/>

    <xsd:simpleType name="RmsRaType">
      <xsd:restriction base="PosDecimalType">
        <xsd:maxInclusive value="120.0"/>
      </xsd:restriction>
    </xsd:simpleType>

to enforce that submitted values stay below 120 (arcseconds).
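For illustration, here is the kind of local pre-check I have in mind, sketched in Python (untested; it assumes the lxml package and a local copy of the schema, and the file names are placeholders):

    # Sketch: validate a batch against a local copy of the submission schema
    # before uploading it. Assumes lxml; "submit.xsd" and "batch.xml" are
    # placeholder paths.
    from lxml import etree

    schema = etree.XMLSchema(etree.parse("submit.xsd"))
    batch = etree.parse("batch.xml")

    if schema.validate(batch):
        print("Batch validates; safe to submit.")
    else:
        # Once the schema carries an upper bound, an absurd rmsRA shows up
        # here with a line number instead of in a rejection email.
        for err in schema.error_log:
            print(f"line {err.line}: {err.message}")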

Thanks!

Bill-Gray commented 4 years ago

Agreed that rmsRA and the like should have an upper limit. (At bare minimum, 180 degrees.) But some astrometry is simultaneously useful and has sigmas much greater than a mere two arcminutes. Some SOHO astrometry is good to a few arcminutes (if in C3 and either too bright or too faint). SWAN astrometry is good to about a degree. Similarly for some old Chinese comet observations, and some spacecraft observations where the spacecraft wasn't designed with the expectation that it'd be asked to do astrometry (I was asked about computing an orbit from such observations). Meteor observations from security and dashboard cameras can be about that bad.

I would argue that if you don't think your astrometry should ever be worse than N arcseconds, you should impose that check within your own software. The XSD limit should just test for definite bogusness (the value should be positive and less than 180 degrees).

schastel commented 4 years ago

120 was just a placeholder (maybe Michael will post the limits used in the MPC pipeline here)

I agree that the software should have its own check before submission, but the role of the XSD is to validate the contents of the XML, and the XSD should reflect the constraints when possible. If my submission will be rejected because of such an issue, I want to know it before curling the data. There is no reason to put load on the MPC submission system when it can be avoided.

michael-rudenko commented 4 years ago

Here are the limits the MPC imposes in its ADES bounds checking:

    'ra':         [0, 360.0],
    'dec':        [-90.0, 90.0],
    'raStar':     [0, 360.0],
    'decStar':    [-90.0, 90.0],
    'dist':       [0, 999999],
    'pa':         [0, 360.0],
    'rmsRA':      [0, 999999],
    'rmsDec':     [0, 999999],
    'rmsDist':    [0, 999999],
    'rmsPA':      [0, 360.0],
    'rmsCorr':    [-1.0, 1.0],
    'delay':      [0, 999999],
    'rmsDelay':   [0, 999999],
    'rmsDoppler': [0, 999999],
    'rmsMag':     [0, 999999],
    'photAp':     [0, 999999],
    'seeing':     [0, 999999],
    'exp':        [0, 999999],
    'rmsFit':     [0, 999999],
    'nStars':     [0, 999999],
    'frq':        [0, 999999],
    'uncTime':    [0, 999999],
    'rmsTime':    [0, 999999]

Not too much thought went into coming up with them. They were just meant to be a crude sanity check for values with numeric ranges.
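For anyone who wants to mirror these client-side before submitting, a check along these lines would do (a sketch only; the table simply restates the limits above and is not the MPC's actual code):

    # Sketch of a client-side range check mirroring the bounds listed above.
    # Keys are ADES element names; this is not the MPC's own implementation.
    MPC_BOUNDS = {
        'ra': (0, 360.0),        'dec': (-90.0, 90.0),
        'raStar': (0, 360.0),    'decStar': (-90.0, 90.0),
        'dist': (0, 999999),     'pa': (0, 360.0),
        'rmsRA': (0, 999999),    'rmsDec': (0, 999999),
        'rmsDist': (0, 999999),  'rmsPA': (0, 360.0),
        'rmsCorr': (-1.0, 1.0),  'delay': (0, 999999),
        'rmsDelay': (0, 999999), 'rmsDoppler': (0, 999999),
        'rmsMag': (0, 999999),   'photAp': (0, 999999),
        'seeing': (0, 999999),   'exp': (0, 999999),
        'rmsFit': (0, 999999),   'nStars': (0, 999999),
        'frq': (0, 999999),      'uncTime': (0, 999999),
        'rmsTime': (0, 999999),
    }

    def out_of_bounds(obs):
        """Return (field, value) pairs of an observation dict that fall outside the bounds."""
        bad = []
        for field, (lo, hi) in MPC_BOUNDS.items():
            if field in obs and not lo <= float(obs[field]) <= hi:
                bad.append((field, obs[field]))
        return bad

    # Serge's example value would be flagged before any data are curled:
    print(out_of_bounds({'ra': 185.3, 'dec': -4.2, 'rmsRA': 28026532265984.00}))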


Bill-Gray commented 4 years ago

Hmmm... I could imagine having three XSDs against which a file could be validated. The 'standard ADES' one would apply the actual limits of bogusness (in this case, 0 < rmsRA < 180 degrees). MPC would use a modified one with lower limits (not much lower at present, but I could imagine MPC deciding that anything above, say, 120 arcseconds ought to be auto-rejected). That modified XSD would be available to the rest of us so that we could test for "MPC-compliant ADES" before submitting it. The lower limits set by MPC would be arbitrary and subject to change as MPC figured out what works for knocking out bogus submissions. Internally, an observatory might use a third XSD with still lower limits that reflect what would constitute an 'error' for your setup. If you're Gaia, the limit on rmsRA might be 0.1 arcsecond. If you observe with an astrolabe, 0.1 degree.
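On the client side that might look something like this (a sketch, untested; assuming lxml, with the three schema file names purely illustrative):

    # Sketch: validate a batch against progressively stricter schemas and
    # report the first layer that fails. Assumes lxml; file names illustrative.
    from lxml import etree

    LAYERS = [
        ("ADES (definite bogusness only)", "general.xsd"),
        ("MPC submission limits",          "submit-mpc.xsd"),
        ("Site-specific limits",           "submit-site.xsd"),
    ]

    def first_failing_layer(xml_path):
        batch = etree.parse(xml_path)
        for label, xsd_path in LAYERS:
            schema = etree.XMLSchema(etree.parse(xsd_path))
            if not schema.validate(batch):
                return label, [str(err) for err in schema.error_log]
        return None, []

    layer, errors = first_failing_layer("batch.xml")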

schastel commented 4 years ago

That's exactly what I intend to do for PS submissions (but I didn't want to add noise to this thread by mentioning it, e.g. rmsMag upper limit would be 2). I just need to make sure that the PS restrictions are not looser than the MPC ones and that's why we would need accurate MPC values.
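A check that the PS limits are nowhere looser than the MPC's is easy enough if both are kept as simple low/high tables like the one Michael posted; roughly (a sketch, untested):

    # Sketch: confirm that site-specific bounds are contained within the MPC
    # bounds, assuming both are kept as {field: (low, high)} tables.
    def looser_than_mpc(site_bounds, mpc_bounds):
        loose = []
        for field, (lo, hi) in site_bounds.items():
            mpc_lo, mpc_hi = mpc_bounds.get(field, (float('-inf'), float('inf')))
            if lo < mpc_lo or hi > mpc_hi:
                loose.append(field)
        return loose  # empty means nothing is looser than the MPC limits

    # e.g. a PS table with 'rmsMag': (0, 2) passes against 'rmsMag': (0, 999999)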

rlseaman commented 4 years ago

Can’t say I’m delighted ADES remains a moving target.

Meanwhile the main CSS concern continues to be large incidental batches that are rejected due to one or a few duplicate observations. My own concern is whether CSS’s imminent switch to ADES will cause rejections to occur more frequently.

Validation of ADES consists of four layers:

  1. Is it conforming XML?
  2. Validation against XSD schema (mostly syntactical)
  3. Limits just supplied by Mike – there are also likely implicit data type restrictions
  4. Validation against MPC curated lists (mostly semantic), https://www.minorplanetcenter.net/iau/info/ADESFieldValues.html

My recommendations:

  1. Truncated or non-conforming XML should be rejected.
  2. Validation should be forgiving, especially for optional fields. Generate email after the fact, e.g., don’t reject astrometry based on an overflowing rms uncertainty that didn’t have to be supplied in the first place.
  3. Don’t reject entire batches due to a few bad eggs – again, send email to resolve issues after the fact.
  4. MPC is now under SBN. SBN / PDS uses schematron for semantic validation (http://schematron.com). Suggest MPC work with SBN personnel to do the same, both for curated lists and pragmatic range limits.

Validation is not an all-or-nothing option. Preserving real-time submissions should be a higher priority.
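Screening could, for instance, be applied per tracklet rather than per batch, roughly as follows (a sketch only; it assumes lxml and the usual ADES layout of <optical> records carrying trkSub, and bounds_ok() stands in for whatever per-observation check is adopted):

    # Sketch: excise only the tracklets containing an out-of-range value
    # instead of rejecting the whole batch. Assumes lxml and unqualified
    # ADES element names; bounds_ok() is a placeholder per-observation check.
    from collections import defaultdict
    from lxml import etree

    def excise_bad_tracklets(xml_path, bounds_ok):
        tree = etree.parse(xml_path)
        by_tracklet = defaultdict(list)
        for obs in tree.iter("optical"):
            key = obs.findtext("trkSub") or obs.findtext("provID") or "unknown"
            by_tracklet[key].append(obs)
        rejected = []
        for key, obs_list in by_tracklet.items():
            if not all(bounds_ok(obs) for obs in obs_list):
                rejected.append(key)
                for obs in obs_list:
                    obs.getparent().remove(obs)  # drop the whole tracklet
        return tree, rejected  # cleaned batch plus tracklets to report back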

Rob

schastel commented 4 years ago

I'm not sure that ADES is still a moving target. Its implementation might be one though.

For what it's worth (see the disclaimers in README), I forked ADES-Master and tried to implement the MPC restrictions in submit.xsd which I published in submit-mpc-sc.xsd: https://github.com/schastel/ADES-Master/tree/master/xsd

Bill-Gray commented 4 years ago

Heck, the punched-card format is 28 years old and still being modified. Most recently, it was to work around the impending A620K Crisis last July. Previous hacks (er, modifications) got us around the A100K crisis, modified observatory codes so they didn't have to be three digits, added the reference net byte in column 72, satellite position offsets in AU, same offsets to a variety of precisions, and references to MPECs and to MPCs past 100K (and again past 110K... the first extension wasn't much of one) and... well, you get the idea. And those are just the MPC-sanctioned changes. Any observation with an uncertainty of 28026532265984.00 arcseconds (Serge's example) is probably wrong in other ways, too. (BTW, Serge, was the observation bad in other ways?) I cannot imagine accepting such an observation uncritically. I'd have rejected it with a note saying what's wrong, allowing for a corrected version to be submitted. Which is exactly what MPC did.

schastel commented 4 years ago

It was P110jZ0. There was nothing too wrong with the detection itself (it was close to an edge); it's the automated rms measurements that were wrong. The measurements for two other detections were fine. Anyway... it should have been caught on our end.


stevechesley commented 4 years ago

So it appears that, with just two exceptions, all of the PosDecimalType cases in the current standard line up exactly with those for which the MPC assigned limits of [0 999999]. So a pretty easy fix would be to change the definition of PosDecimalType to have an upper limit of a million.

The two exceptions are rmsPA, which can conveniently be given RAType to match the MPC's new limits, and nStars. Sure, a million stars in your solution may not be right, but who's counting? The current type for nStars is actually xsd:positiveInteger, and this too could be revised to max out at a million.

All of the other limits that Mike shared are already a part of the ADES standard and implemented in submit.xsd.

I am not supportive of maintaining a third schema. submit.xsd is supposed to represent what the MPC is using for XML validation of submissions so that observers can validate and catch errors before submission. There is no point in carrying two such animals. If submit.xsd does not work as advertised then it should be modified. Folks can be more restrictive locally, but submit.xsd should represent what the MPC is doing. (The MPC is also checking for allowed values for some fields, which are not enumerated in the standard, but this decision was made long ago because those values change too often to be usefully incorporated into the standards documentation.)

So the simplest path forward would be to make some modest revisions to the ADES standard definitions to replicate the MPC's garbage trap. The one issue I see is that most of these limits do not have meaningful rationales. A million arcsec? A million stars? A million magnitudes? Are we okay with that? The alternative would be endless haranguing about what the right upper bound should be for each element (and a specific data type for each element), which I would not enjoy (and which is already underway above). Taking the MPC's approach seems more likely to actually lead to a conclusion and it does do what is intended.

I think that is the question for this particular Issue: Should ADES be modified to a) match the MPC's rather arbitrary limits? Or b) should smarter limits be applied? Or c) just ignore the MPC's limits in the official standard? I think a) is the right answer here, but would be happy to hear contrary thoughts.

stevechesley commented 4 years ago

And, Rob, I don't think that this means the ADES is still a moving target in any practical sense. The issue discussed here should not affect past or future batches, except those that have bogus values and would be eventually rejected anyway.

And, I don't agree that validation should be more forgiving for optional fields. I would not want to see the database polluted by allowing garbage values, and at submission (and before processing) is the right time to screen them out.

In any case, most of your concerns would seem to form the basis of a different issue than this one. But they look like they have more to do with MPC operations and less to do with the ADES standard and implementation, and so this may not be the right venue. Even so, you should feel free to open a new issue to discuss.

rlseaman commented 4 years ago

I am literally taking time from working on the again-refactored CSS ADES implementation to write this email. This will move to one of our telescopes tomorrow for testing…assuming you guys don’t change it too much tonight.

To cut to the chase: XML is great…well, good, and generally isomorphic to JSON. XML schema, however, sucks. Schematron backfills many of the problems and is how MPC should implement those curated lists. This should be an easy sell since SBN already uses schematron.

Defining conformance to a standard and providing tools to confirm same is not a rigid concept. The idea of satisfying a minimal level of conformance, rather than demanding maximal conformance, is neither radical nor an ethical failing. We should all strive to generate documents that adhere to the standard, and to accept documents that adhere “good enough”.

There is currently no schema validation for the 80-column format. And yet MPC will reject large incidental batches if a single observation is duplicated, rather than just excising that tracklet. Will individual ADES elements (e.g., single observations, or tracklets of same) be rejected individually due to non-conforming optional elements, or will the whole batch?

I see Steve has a follow-up email arguing "I don't think that this means the ADES is still a moving target." The G96 test set I'm working with has 114,000 observations. I'm rather pleased with how the refactoring has worked, layering ADES on FITS binary tables (a suggestion from Frank Shelly). Generating conforming ADES with the requested uncertainties, etc, takes ~75 columns and keywords per observation. That's about 8 million numbers on a moderate night, maybe 12 million on our busiest. (And up to 30 million per night including the tracklets that currently don't get presented for validation.)

The pipeline workflow that generates these numbers is complex and relies on both bespoke and third party software, catalogs, ephemerides, etc. Presumably Pan-STARRS has a similar scale MOPS problem. And it sounds like Serge has been playing whack-a-mole with different classes of exceptions, too.

A lot of this complexity is in service of ADES elements that are optional. If rmsDec, for instance, comes out negative (yes, I’m catching that, just an example) it does not necessarily mean the rest of the elements are questionable, but perhaps just that the CD matrix for one of the images was wonky (yes, I just whacked that mole), an otherwise harmless issue.

This community, especially the NEO surveys, often operates down in the noise. Our most problematic observations can also be our most valuable, and the tracklets may well require urgent follow-up. Overly stringent schema-based validation requirements risk introducing quite significant unnecessary latency in the service of formal compliance.

Rob


Bill-Gray commented 4 years ago

Hi Rob,

Truthfully, it sounds to me as if you aren't ready to submit uncertainties yet. No sin in that -- obviously, lots of people aren't -- but if you're getting negative uncertainties, you should fix that. As you point out, submitting sigmas is optional. But submitting wrong sigmas is not. Your options are to get it right, or not do it at all.

I don't see how, even with a wonky CD matrix, you'd get negative sigmas. Imaginary ones, yes (after taking the square root of a negative number), but negative?

That said, negative (or imaginary or complex) sigmas aren't really so bad. They tell me I should ignore your data. Ludicrously high sigmas will have the same effect; they are literally telling me you have no idea what the value is, and your observation will therefore have no/little effect on the orbit. The danger point comes in when you tell me you've nailed the object to a milliarcsecond when your data is only good to 0.3 arcsec. I may twist the orbit all out of recognition trying to fit that observation. (Or, more likely, I'll reject it because its 0.1-arcsec residuals correspond to several hundred sigmas. Or I may do as I believe JPL does, and put a floor on sigmas or add 0.1 arcsec in quadrature with them... I really don't like that idea, but might do it on an observatory-specific basis, with the floor for Gaia being a few milliarcseconds and the floor for SOHO C3 observations being about 15 arcsec, with a warning emitted that the sigmas look unusual for that site.)

A lot of checks are outside the realm of XML schema anyway, and I'm (mostly) okay with that. I flag observations made below the horizon (and, as you know, plan to flag those made outside of altitude, dec, hour angle, and elongation restrictions specified for a given telescope... and perhaps time of day limits). I flag ground-based optical observations made in daytime, and duplicate observations of various sorts (total duplicates, observations made at different RA/decs of the same object at the same time). Flagging of bad/suspicious sigmas really should, in my opinion, be done on a site-specific basis, and would best be done outside of the formal ADES specification.
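For what it's worth, the duplicate checks are simple enough to run before submission; roughly (Python, untested; each observation is assumed to be a dict of ADES fields):

    # Sketch of the two duplicate checks described above: total duplicates,
    # and the same object observed at the same time at different positions.
    from collections import defaultdict

    def find_duplicates(observations):
        exact, conflicting = [], []
        seen = set()
        by_obj_time = defaultdict(list)
        for obs in observations:
            key = (obs.get("trkSub"), obs.get("obsTime"), obs.get("ra"), obs.get("dec"))
            if key in seen:
                exact.append(obs)  # total duplicate
            seen.add(key)
            by_obj_time[(obs.get("trkSub"), obs.get("obsTime"))].append(obs)
        for (obj, time), group in by_obj_time.items():
            if len({(o.get("ra"), o.get("dec")) for o in group}) > 1:
                conflicting.append((obj, time))  # same object/time, different RA/Dec
        return exact, conflicting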

rlseaman commented 4 years ago

It was an example, apparently a bad one.

XML schema are awkward and limited in utility. They should be kept simple. Many of the examples discussed in this thread are more pertinent to schematron, already in the SBN toolkit. Whatever validation framework, conformance should be more nuanced than simply rejecting entire batches.
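As an illustration, the sort of cross-field rule Schematron expresses declaratively can be prototyped in a few lines; the two example rules below are purely illustrative and are not MPC policy:

    # Sketch of cross-field checks of the kind Schematron handles declaratively.
    # An observation is assumed to be a dict of ADES fields; rules illustrative.
    def semantic_issues(obs):
        issues = []
        # Example rule 1: a reported magnitude should carry a plausible uncertainty.
        if "mag" in obs and "rmsMag" in obs and not 0.0 < float(obs["rmsMag"]) <= 2.0:
            issues.append("rmsMag implausible for the reported mag")
        # Example rule 2: astrometric uncertainties should come as a pair.
        if ("rmsRA" in obs) != ("rmsDec" in obs):
            issues.append("rmsRA and rmsDec should be supplied together")
        return issues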

Rob


Bill-Gray commented 4 years ago

Agreed, there should be a distinction between "ADES validation" (which is pretty much okay as it is) and "MPC validation" (which may be more variable and based on criteria MPC finds works well for them; I would give them wide discretion on that point). Rejection of batches, or of all observations submitted for a particular object, is a tricky subject and well outside the scope of the ADES definition. It's a decision for MPC to make, and then for orbit computers to decide what they will reject. (Generally speaking, if I have a lot of data, I am merciless about rejecting potentially dodgy data. If a site submits three observations of an object and two are bad, I'll toss the third as "probably not all that great either; guilty by association". If all I have are four observations, I am obviously less able to be picky.)

schastel commented 4 years ago

Thanks Bill, but with all due respect, what you do with the MPC data is not my concern and is not the purpose of this issue. We are data submitters. We want you to have the detections. Whether you discard them or not is your call, as long as you have them.

My point in this issue is that I want to know before I use expensive operations (i.e. typically curling and/or parsing the rejection emails, then manually reviewing them) if a batch will be rejected. If that can be expressed with a modified XSD, I want that XSD. If a Schematron scheme is provided, I'll use it. An extra layer of Relax-NG on top of it: I'll take it. Anything that will prevent expensive operations is welcome (especially when it comes to operations on thousands of tracklets).

Now, another issue (and maybe a different ticket should be opened about it) is the implicit semantics behind the word "optional" in the XSD. For instance, rmsMag is optional. If it's not present in a PS submission while it usually is, which meaning does it have? Does that mean that the submission was manually measured and rms is not available? Is it an internal automated check that removed it so that the submission doesn't get rejected? It could even be a sloppy software deployment... If I give reasonable values for epoch, ra, and dec without reporting that rmsMag = 28026532265984.0, how do you know that there is a concern with the detection? The issue is not about one detection which is obviously wrong; the issue is about the many detections with large-ish rmsRA, rmsDec. If I decide not to publish the rmsRA, rmsDec, I am lying about the quality of the datum.

Honestly, I would prefer to give rmsMag = 28026532265984.0. You - and everyone else - know immediately that the detection is dubious. It doesn't change much in terms of storage: storing 999999 or 28026532265984 will use 4 or 8 bytes (NaN would be better, by the way, and is supported in XSD and in PostgreSQL). Then the MPC or the community can flag it and tell us so that we can prevent that kind of issue.

Thanks.

Bill-Gray commented 4 years ago

Ah, I see... you really just want to know what will trigger a rejection so that you can fix the problem(s) at your end, before going to all the trouble of sending the data. Seems like an excellent idea to me. Whatever criteria MPC uses for rejections ought to be publicly known. Dunno about it being part of ADES (it should be more flexible than that), but yes, you should know what the criteria are.

I'll bet, though, that you won't find them to be all that helpful. The constraints Michael posted are pretty darn loose. If you know that your best observations are good to, say, 0.02 arcsec, and your worst are good to three arcseconds, and that anything outside (say) 0.01 to five arcseconds reflects a definite error, why wouldn't you set limits accordingly?

MPC will probably accept your data even if the observations were made in daylight and below the horizon. They also have to have "lowest common denominator" constraints, able to accept data from new/inexperienced/careless observers. I'd not rely on them to catch errors. Nor, of course, would I say "it's probably bad data, but it does pass the MPC criteria".

Agreed, also, that a second ticket about the meaning of 'optional' would be good. I know what to do when an 80-column report lacks sigmas (because 99.9% of them do). Send me ADES data without sigmas, and things are definitely fuzzier.

stevechesley commented 4 years ago

Folks, I'd like to close this issue from a few months ago, and yet there is no closure in Mudville. The question at hand is whether or not the schema should be changed to emulate the MPC's ad hoc traps for bogus numerical values in ADES. If you have other topics that you want to discuss then open another issue, please.

The change to the schema can be done, as I describe above (May 19). For submit.xsd in particular, this change would be in the spirit of vetting the submission for fatal formatting errors that will cause the MPC to reject the submission. Not all errors can be identified by schema validation, but perhaps those that can be should be.

On the other hand, life is easier if we don't have to change anything, better still being the enemy of good enough. And anyway, bogus values over a million point to a flawed observation processing pipeline in need of repair. That should be rare enough that submission rejection could be an acceptable outcome.

Unless I hear a chorus of calls in the next week or so to revise the schema to include the MPC traps for bogus numbers I will leave it alone and close the issue.

Serge also asked if, for optional values like rmsMag, the presence or absence of a value implies anything special: "If it's not present in a PS submission while it usually is, which meaning does it have? Does that mean that the submission was manually measured and rms is not available? Is it an internal automated check that removed it so that the submission doesn't get rejected? It could even be a sloppy software deployment..."

The answer is that the absence does not communicate anything explicit, though one might infer that something is amiss. But if there is something special about that detection that makes it different from what one would expect from the PS data stream (like manual measurement or problematic photometry) it can and should be noted in the remarks.

rlseaman commented 4 years ago

I recall commenting at the time and imagine now-me would agree with whatever then-me said. Validation should apply observation-by-observation or tracklet-by-tracklet. Large incidental batches shouldn’t be rejected due to a single deemed-nonconforming metadatum.

Optional means optional. A missing value should not be inferred to imply anything one way or another. In particular, astrometry can take multiple paths through a pipeline (or be measured manually, for that matter), and it may be simple logistics that governs the presence or absence of an element or attribute. For example, the CSS pipeline has both catalog and image-differencing components and combines the output before submission. The two routes generate different collections of metadata. Of course there are also "optional" elements that must be omitted if the element they correspond to is omitted. On the other hand, a manually (re)measured point may retain an automatically generated rms value.

CSS is not currently submitting . The NEO community might benefit from a discussion of what is appropriate usage. In particular, the COM fields from 80-column format appear to have been deprecated in favor of the new comet submission web form (and perhaps others).

Rob


schastel commented 4 years ago

Hi, I wrote my own xsd which takes into account some of the MPC limits. Feel free to close the issue.


stevechesley commented 2 years ago

The sanity checks discussed above (and many others) are now incorporated into ADES v2022 with the merging of the fieldWidth branch.

Feel free to look over the numerous restrictions now in v2022 and raise concerns in a new issue.