rubin-dp0 / Support

Submit Github Issues related to DP0
MIT License
1 stars 3 forks source link

[BUG] failure to catch syntax error in SQL #34

Closed rmandelb closed 1 year ago

rmandelb commented 1 year ago

Describe the bug When running a TAP query, it executes an SQL query with a bug without identifying the bug, silently returning results that don't quite make sense.

To Reproduce Steps to reproduce the behavior:

  1. Set up TAP queries following the process in this contributed notebook.
  2. Compare the results of this query, which does not have the bug I mentioned:

use_center_coords = "62, -37" use_radius = "0.05" query1 = "SELECT " + \ "objectId, ra, dec, clean, " + \ "mag_i_cModel, mag_g_cModel, psf_fwhm_i, extendedness, blendedness " + \ "FROM dp01_dc2_catalogs.object " + \ "WHERE CONTAINS(POINT('ICRS', ra, dec), " + \ "CIRCLE('ICRS', " + use_center_coords + ", " + use_radius + ")) = 1 " + \ "AND clean = 1 " + \ "AND mag_i_cModel < 26.0"

against the results of this query, which is buggy -- it's missing the comma after clean:

use_center_coords = "62, -37" use_radius = "0.05" query1 = "SELECT " + \ "objectId, ra, dec, clean " + \ "mag_i_cModel, mag_g_cModel, psf_fwhm_i, extendedness, blendedness " + \ "FROM dp01_dc2_catalogs.object " + \ "WHERE CONTAINS(POINT('ICRS', ra, dec), " + \ "CIRCLE('ICRS', " + use_center_coords + ", " + use_radius + ")) = 1 " + \ "AND clean = 1 " + \ "AND mag_i_cModel < 26.0"

You will see that both queries run and return the same number of entries. However, it silently omits a clean named column, and the rest of the entries get shifted over by one -- i.e., mag_i_cModel entries correspond to the boolean values that should have been in clean, etc. The actual mag_i_cModel values do not get returned at all.

Expected behavior It should return a syntax error due to the missing comma between two of the requested columns under SELECT.

Screenshots N/A

URL N/A

Desktop (please complete the following information):

Smartphone (please complete the following information): N/A

Additional context None

ktlim commented 1 year ago

This is unfortunately legal SQL and ADQL. SELECT a AS foo can be written without the AS keyword to define the alias: SELECT a foo. So the "bad" code is aliasing the clean values with the name mag_i_cModel.

If the values of the subsequent columns are not correct, however, that would be a bug.

rmandelb commented 1 year ago

Oh -- I did not realize that SELECT a AS foo could be aliased in this way. Yikes. That seems like poor language design but obviously not up to you.

I have checked more carefully and the problem seems restricted to mag_i_cModel rather than subsequent columns. I thought I checked this before but I misinterpreted the output when I looked at it before.

Thanks for clarifying - I'll close this non-bug-report.

MelissaGraham commented 1 year ago

This is really good to know, I didn't know this either and it looks like a common trap -- thanks @rmandelb for raising this and @ktlim for the answer.

wmwv commented 1 year ago

Thanks @rmandelb for flagging this and teaching us all* something! I also didn't know about this aspect of the SQL standard.

I have made exactly this mistake of missing a comma in a block of column names and just noticed that I wasn't getting the right number of columns and just fixed it, but I never understood why the query didn't fail.

[*] I guess all except @ktlim , that is 😄 .