pacificclimate / pdp_util

A package supplying numerous apps for running PCIC's data server
GNU General Public License v3.0
0 stars 0 forks source link

'I/O operation on closed file' and SQL syntax error on some data download requests #44

Open rod-glover opened 2 years ago

rod-glover commented 2 years ago
2022-08-18 23:05:06 [17] [ERROR] Error handling request
Traceback (most recent call last):
  File "/opt/dockremap/.local/lib/python2.7/site-packages/gunicorn/workers/async.py", line 114, in handle_request
    for item in respiter:
  File "/opt/dockremap/pdp/error.py", line 74, in __call__
    start_response(status, response_headers, sys.exc_info())
  File "/opt/dockremap/.local/lib/python2.7/site-packages/gunicorn/http/wsgi.py", line 237, in start_response
    reraise(exc_info[0], exc_info[1], exc_info[2])
  File "/opt/dockremap/pdp/error.py", line 68, in __call__
    for block in response_iter:
  File "/opt/dockremap/.local/lib/python2.7/site-packages/ga_wsgi_client/__init__.py", line 41, in __call__
    for chunk in app_iter:
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pdp_util/agg.py", line 85, in ziperator
    for name, responder in responders:
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pdp_util/agg.py", line 196, in get_pcds_responders
    yield (name, handler(newenv, null_start_response))
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pydap/handlers/pcic/__init__.py", line 78, in __call__
    app = SQLHandler(f.name)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pydap/handlers/sql/__init__.py", line 133, in __init__
    results = conn.execute(query)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1268, in execute
    clause, params or {}
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
ProgrammingError: (psycopg2.ProgrammingError) syntax error at or near "Depth"
LINE 1: ...MAX(CASE WHEN vars_id=832 THEN datum END) as Snow Depth,MAX(...
                                                             ^

[SQL: SELECT TempC, SolarRadiationWm, WindDirection, WindSpeedms, Pressurembar, Rainmm, GustSpeedms, obs_time, RH, DewPtC FROM (SELECT obs_time,MAX(CASE WHEN vars_id=505 THEN datum END) as DewPtC,MAX(CASE WHEN vars_id=507 THEN datum END) as GustSpeedms,MAX(CASE WHEN vars_id=502 THEN datum END) as Pressurembar,MAX(CASE WHEN vars_id=501 THEN datum END) as Rainmm,MAX(CASE WHEN vars_id=504 THEN datum END) as RH,MAX(CASE WHEN vars_id=535 THEN datum END) as RHn,MAX(CASE WHEN vars_id=534 THEN datum END) as RHx,MAX(CASE WHEN vars_id=832 THEN datum END) as Snow Depth,MAX(CASE WHEN vars_id=509 THEN datum END) as SolarRadiationWm,MAX(CASE WHEN vars_id=503 THEN datum END) as TempC,MAX(CASE WHEN vars_id=529 THEN datum END) as Tm,MAX(CASE WHEN vars_id=531 THEN datum END) as Tn,MAX(CASE WHEN vars_id=530 THEN datum END) as Tx,MAX(CASE WHEN vars_id=510 THEN datum END) as Wetness,MAX(CASE WHEN vars_id=682 THEN datum END) as Wetness_20cm,MAX(CASE WHEN vars_id=508 THEN datum END) as WindDirection,MAX(CASE WHEN vars_id=538 THEN datum END) as Wind_n,MAX(CASE WHEN vars_id=506 THEN datum END) as WindSpeedms from obs_raw WHERE (history_id = 14116) AND vars_id IN (505,507,502,501,504,535,534,832,509,503,529,531,530,510,682,508,538,506) GROUP BY obs_time ORDER BY obs_time) as foo LIMIT 1]
(Background on this error at: http://sqlalche.me/e/f405)
2022-08-18 23:05:06 [17] [ERROR] Error handling request
Traceback (most recent call last):
  File "/opt/dockremap/.local/lib/python2.7/site-packages/gunicorn/workers/async.py", line 114, in handle_request
    for item in respiter:
  File "/opt/dockremap/pdp/error.py", line 74, in __call__
    start_response(status, response_headers, sys.exc_info())
  File "/opt/dockremap/.local/lib/python2.7/site-packages/gunicorn/http/wsgi.py", line 237, in start_response
    reraise(exc_info[0], exc_info[1], exc_info[2])
  File "/opt/dockremap/pdp/error.py", line 68, in __call__
    for block in response_iter:
  File "/opt/dockremap/.local/lib/python2.7/site-packages/ga_wsgi_client/__init__.py", line 41, in __call__
    for chunk in app_iter:
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pdp_util/agg.py", line 85, in ziperator
    for name, responder in responders:
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pdp_util/agg.py", line 196, in get_pcds_responders
    yield (name, handler(newenv, null_start_response))
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pydap/handlers/pcic/__init__.py", line 78, in __call__
    app = SQLHandler(f.name)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/pydap/handlers/sql/__init__.py", line 133, in __init__
    results = conn.execute(query)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/orm/session.py", line 1268, in execute
    clause, params or {}
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1466, in _handle_dbapi_exception
    util.raise_from_cause(sqlalchemy_exception, exc_info)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/util/compat.py", line 383, in raise_from_cause
    reraise(type(exception), exception, tb=exc_tb, cause=cause)
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/base.py", line 1244, in _execute_context
    cursor, statement, parameters, context
  File "/opt/dockremap/.local/lib/python2.7/site-packages/sqlalchemy/engine/default.py", line 552, in do_execute
    cursor.execute(statement, parameters)
ProgrammingError: (psycopg2.ProgrammingError) syntax error at or near "Depth"
LINE 1: ...MAX(CASE WHEN vars_id=832 THEN datum END) as Snow Depth,MAX(...
                                                             ^

[SQL: SELECT TempC, SolarRadiationWm, WindDirection, WindSpeedms, Pressurembar, Rainmm, GustSpeedms, obs_time, RH, DewPtC FROM (SELECT obs_time,MAX(CASE WHEN vars_id=505 THEN datum END) as DewPtC,MAX(CASE WHEN vars_id=507 THEN datum END) as GustSpeedms,MAX(CASE WHEN vars_id=502 THEN datum END) as Pressurembar,MAX(CASE WHEN vars_id=501 THEN datum END) as Rainmm,MAX(CASE WHEN vars_id=504 THEN datum END) as RH,MAX(CASE WHEN vars_id=535 THEN datum END) as RHn,MAX(CASE WHEN vars_id=534 THEN datum END) as RHx,MAX(CASE WHEN vars_id=832 THEN datum END) as Snow Depth,MAX(CASE WHEN vars_id=509 THEN datum END) as SolarRadiationWm,MAX(CASE WHEN vars_id=503 THEN datum END) as TempC,MAX(CASE WHEN vars_id=529 THEN datum END) as Tm,MAX(CASE WHEN vars_id=531 THEN datum END) as Tn,MAX(CASE WHEN vars_id=530 THEN datum END) as Tx,MAX(CASE WHEN vars_id=510 THEN datum END) as Wetness,MAX(CASE WHEN vars_id=682 THEN datum END) as Wetness_20cm,MAX(CASE WHEN vars_id=508 THEN datum END) as WindDirection,MAX(CASE WHEN vars_id=538 THEN datum END) as Wind_n,MAX(CASE WHEN vars_id=506 THEN datum END) as WindSpeedms from obs_raw WHERE (history_id = 14116) AND vars_id IN (505,507,502,501,504,535,534,832,509,503,529,531,530,510,682,508,538,506) GROUP BY obs_time ORDER BY obs_time) as foo LIMIT 1]
(Background on this error at: http://sqlalche.me/e/f405)
Exception ValueError: 'I/O operation on closed file' in <bound method ZipFile.__del__ of <zipfile.ZipFile object at 0x7fe11b943f50>> ignored
rod-glover commented 2 years ago

The error message contains a list of variables that were being queried (ids 505, 507, ...) and this can tell us some useful information.

crmp=> select nw.network_name, v.vars_id, v.net_var_name, v.standard_name, v.cell_method  from meta_network nw natural join meta_vars v where vars_id IN (505,507,502,501,504,535,534,832,509,503,529,531,530,510,682,508,538,506);
 network_name | vars_id |   net_var_name   |               standard_name                |  cell_method  
--------------+---------+------------------+--------------------------------------------+---------------
 FLNRO-FERN   |     534 | RHx              | relative_humidity                          | time: maximum
 FLNRO-FERN   |     506 | WindSpeedms      | wind_speed                                 | time: mean
 FLNRO-FERN   |     538 | Wind_n           | wind_speed                                 | time: minimum
 FLNRO-FERN   |     504 | RH               | relative_humidity                          | time: mean
 FLNRO-FERN   |     505 | DewPtC           | dew_point_temperature                      | time: mean
 FLNRO-FERN   |     502 | Pressurembar     | air_pressure                               | time: point
 FLNRO-FERN   |     501 | Rainmm           | lwe_thickness_of_precipitation_amount      | time: sum
 FLNRO-FERN   |     529 | Tm               | air_temperature                            | time: mean
 FLNRO-FERN   |     530 | Tx               | air_temperature                            | time: maximum
 FLNRO-FERN   |     531 | Tn               | air_temperature                            | time: minimum
 FLNRO-FERN   |     503 | TempC            | air_temperature                            | time: point
 FLNRO-FERN   |     508 | WindDirection    | wind_from_direction                        | time: mean
 FLNRO-FERN   |     535 | RHn              | relative_humidity                          | time: minimum
 FLNRO-FERN   |     507 | GustSpeedms      | wind_speed_of_gust                         | time: maximum
 FLNRO-FERN   |     509 | SolarRadiationWm | surface_downwelling_shortwave_flux         | time: mean
 FLNRO-FERN   |     510 | Wetness          | volume_fraction_of_condensed_water_in_soil | time: point
 FLNRO-FERN   |     682 | Wetness_20cm     | volume_fraction_of_condensed_water_in_soil+| time:point
              |         |                  |                                            | 
 FLNRO-FERN   |     832 | Snow Depth       | surface_snow_depth                         | time: point
(18 rows)

Apart from some other questionable items such as the linefeed in the standard_name of var id #682, the problem for this query is, yes, the variable with a space in its net_var_name, namely #832, "Snow Depth".

rod-glover commented 2 years ago

The easy (easier) fix for this is to update the net_var_name of all variables when they are not valid SQL identifiers. We did this for the same problem in the metnorth database. Here's an extract from an email conversation leading up to the update to the database:

Hi Faron,

The metnorth database has many values of meta_vars.net_var_name that contain spaces, which causes trouble for some of our apps. In the CRMP database, none of these var names have spaces -- there are a variety of names, some with underscores, some with CamelCase, etc. but no spaces.

What we'd like to do is to replace every space in every variable name with underscore. Thus "Snow on Grnd" -> "Snow_on_Grnd", etc. Would that be OK with you?

There's also a single name that reads "Dew Point Temp (°​C)", which we'd like to rename to Dew_Point_Temp_C.

A more laborious and in principle fully robust fix is to quote the net_var_name where it is used as an identifier in the query. However, I suspect that this alone will not solve the problem, as the results of this query must be used somewhere, and in that broader use context there may well remain the question of a valid name.

A third solution may be to modify the code that creates new variables (crmprtd, for example) such that it modifies invalid net_var_names to valid ones. This will work so long as we know all programs that do this. And so long as there is no direct human input.

rod-glover commented 2 years ago

The following query shows that a relatively small number of stations, all in the FLNRO-FERN network, have variables with a net_var_name that is not a valid identifier, and that these are all one case.

NB: There are many other variables with problem net_var_names, but they are all in an unpublished network called DFO_CCG_lighthouse.

crmp=> SELECT nw.network_name, hx.station_name, v.net_var_name FROM meta_vars v NATURAL JOIN meta_station natural join meta_history hx natural join meta_network nw WHERE v.net_var_name !~ '^[a-zA-Z0-9_]*$' and nw.publish = true;
 network_name |             station_name             | net_var_name 
--------------+--------------------------------------+--------------
 FLNRO-FERN   | WADF1 Seedtree                       | Snow Depth
 FLNRO-FERN   | WADF2 Burn                           | Snow Depth
 FLNRO-FERN   | WADF4 Cabin                          | Snow Depth
 FLNRO-FERN   | WADF5 Alpine                         | Snow Depth
 FLNRO-FERN   | WADF7 Kokanee                        | Snow Depth
 FLNRO-FERN   | WADF3 Ross                           | Snow Depth
 FLNRO-FERN   | WADF6 Westfork                       | Snow Depth
 FLNRO-FERN   | WADF8 Ranger                         | Snow Depth
 FLNRO-FERN   | Perkins Peak Stn                     | Snow Depth
 FLNRO-FERN   | Gunnel3                              | Snow Depth
 FLNRO-FERN   | Bowron Pit                           | Snow Depth
 FLNRO-FERN   | Coalmine Wx                          | Snow Depth
 FLNRO-FERN   | EP1104.01 Beedy Creek (SBSdw10       | Snow Depth
 FLNRO-FERN   | EP1208 Itcha-Ilgachuz                | Snow Depth
 FLNRO-FERN   | EP1104.02 Quesnel Highland (ESSFwc3) | Snow Depth
 FLNRO-FERN   | EP1104.02 Quesnel Highland (ESSFwc3) | Snow Depth
 FLNRO-FERN   | EP1104.02 Mt. Tom (ESSFwk1)          | Snow Depth
 FLNRO-FERN   | EP1104.02 Quesnel Highland (ESSFwc3) | Snow Depth
 FLNRO-FERN   | Willow-BowronWx                      | Snow Depth
 FLNRO-FERN   | EP1104.02 Quesnel Highland (ESSFwc3) | Snow Depth
 FLNRO-FERN   | EP1104.02 Mt. Tom (ESSFwc3)          | Snow Depth
 FLNRO-FERN   | Blackhawk                            | Snow Depth
 FLNRO-FERN   | Boulder Creek                        | Snow Depth
 FLNRO-FERN   | BulkleyWx                            | Snow Depth
 FLNRO-FERN   | Canoe Mountain Stn                   | Snow Depth
 FLNRO-FERN   | CPFWx                                | Snow Depth
 FLNRO-FERN   | Crystal Lake                         | Snow Depth
 FLNRO-FERN   | Dunster                              | Snow Depth
 FLNRO-FERN   | EndakoWx                             | Snow Depth
 FLNRO-FERN   | GeorgeWx                             | Snow Depth
 FLNRO-FERN   | Gnat Pass                            | Snow Depth
 FLNRO-FERN   | GunnelWx                             | Snow Depth
 FLNRO-FERN   | Hourglass                            | Snow Depth
 FLNRO-FERN   | Hudson Bay Mtn2                      | Snow Depth
 FLNRO-FERN   | Kluskus                              | Snow Depth
 FLNRO-FERN   | Mackenzie Jxn                        | Snow Depth
 FLNRO-FERN   | Mcbride Mountain Stn                 | Snow Depth
 FLNRO-FERN   | MiddleforkWx                         | Snow Depth
 FLNRO-FERN   | NondaWx                              | Snow Depth
 FLNRO-FERN   | PinkWx                               | Snow Depth
 FLNRO-FERN   | SaxtonWx                             | Snow Depth
 FLNRO-FERN   | SmithersCmpd                         | Snow Depth
 FLNRO-FERN   | ThompsonWx                           | Snow Depth
(43 rows)
rod-glover commented 2 years ago

Final note: We can remove this problem by one of the measures noted above, but it does not address another salient fact, which is that downloads of these stations crap out in a highly unsatisfactory and uninformative way. See #8 and #26 for related but not identical problems.

I suspect the problem lies in this code, which does not catch any errors that may occur while processing the "responders", which are the objects that produce data for inclusion in the request response.

jameshiebert commented 2 years ago

If we wanted to go the route of quoting the use of net_var_name from any application code, this is easy to do from SQLAlchemy:

https://docs-sqlalchemy.readthedocs.io/ko/latest/core/sqlelement.html#sqlalchemy.sql.elements.quoted_name

I suspect that this alone will not solve the problem, as the results of this query must be used somewhere

We'd have to look into where they get used, but I believe that it would only be in output data. So cells of a CSV or XLSX file, or variable names in a NetCDF file. Worth exploring.

A 4th option would be to put a check constraint on that column, such that only SQL identifiers are valid data. That way no automated process or manual entry would be able to add data with spaces (or other incompatible characters).

rod-glover commented 2 years ago

Nice to know about the SQLAlchemy quoting. I don't think it can be applied here, as all this takes place in stored procedures in the database. In particular, the function crmp.getStationVariableTable generates the SELECT query with the bad identifiers, and crmp.query_one_station executes it. None of this is in reach of SQLAlchemy quoting since it doesn't construct the query, just invokes the latter function.

So cells of a CSV or XLSX file, or variable names in a NetCDF file. Worth exploring.

Hmm. I think the only immediate problem might be with a NetCDF file; I'll look into variable name syntax there. However, even if we can use a funny variable name in both cases, it might be awkward for the users of the file. Maybe better to enforce syntax up front.

I thought of check constraint over the weekend, too. Glad to hear you validate the idea. It might be the simplest way to do it.

Quoting would be nicer in some ways, though. Both require updates to the database, i.e., schema migrations. We can hand-bomb it if it's urgent, though -- given we are about a month out from the platform migration of crmp.

rod-glover commented 1 year ago

This problem has recurred and/or is still present:

db/crmp=> select network_id, network_name, vars_id, net_var_name
db/crmp-> from meta_network 
db/crmp-> natural join meta_vars 
db/crmp-> where net_var_name ~ '\W'
db/crmp-> order by network_name, net_var_name;
 network_id | network_name | vars_id |        net_var_name        
------------+--------------+---------+----------------------------
         11 | FLNRO-FERN   |     832 | Snow Depth
         11 | FLNRO-FERN   |     833 | Soil Temp
         11 | FLNRO-FERN   |     835 | Water Content, m3/m3 15 cm
         11 | FLNRO-FERN   |     836 | Water Content, m3/m3 30 cm
         11 | FLNRO-FERN   |     834 | Water Content, m3/m3 5 cm
         13 | MoTI         |     491 | ATM.PRESS
         13 | MoTI         |     483 | DEW.POINT
         13 | MoTI         |     481 | HRLY.PRCP
         13 | MoTI         |     486 | MAX.GUST.1
         13 | MoTI         |     489 | MAX.GUST.2
         13 | MoTI         |     479 | MAX.TEMP
         13 | MoTI         |     478 | MIN.TEMP
         13 | MoTI         |     482 | OBS.TYPE
         13 | MoTI         |     480 | PRES.TEMP
         13 | MoTI         |     484 | SNOW.PACK
         13 | MoTI         |     487 | WND.DIR.1
         13 | MoTI         |     490 | WND.DIR.2
         13 | MoTI         |     485 | WND.SPD.1
         13 | MoTI         |     488 | WND.SPD.2
(19 rows)

We can also re-run the previous query, which gives station names associated to these:

db/crmp=> SELECT nw.network_name, hx.station_name, v.net_var_name FROM meta_vars v NATURAL JOIN meta_station natural join meta_history hx natural join meta_network nw WHERE v.net_var_name !~ '^[a-zA-Z0-9_]*$' and nw.publish = true;
 network_name |             station_name             |        net_var_name        
--------------+--------------------------------------+----------------------------
 FLNRO-FERN   | WADF1 Seedtree                       | Water Content, m3/m3 5 cm
 FLNRO-FERN   | WADF1 Seedtree                       | Water Content, m3/m3 30 cm
 FLNRO-FERN   | WADF1 Seedtree                       | Water Content, m3/m3 15 cm
 FLNRO-FERN   | WADF1 Seedtree                       | Soil Temp
 FLNRO-FERN   | WADF1 Seedtree                       | Snow Depth
 FLNRO-FERN   | WADF2 Burn                           | Water Content, m3/m3 5 cm
 FLNRO-FERN   | WADF2 Burn                           | Water Content, m3/m3 30 cm
 FLNRO-FERN   | WADF2 Burn                           | Water Content, m3/m3 15 cm
 FLNRO-FERN   | WADF2 Burn                           | Soil Temp
 FLNRO-FERN   | WADF2 Burn                           | Snow Depth

...

 FLNRO-FERN   | NorthFraser                          | Water Content, m3/m3 5 cm
 FLNRO-FERN   | NorthFraser                          | Water Content, m3/m3 30 cm
 FLNRO-FERN   | NorthFraser                          | Water Content, m3/m3 15 cm
 FLNRO-FERN   | NorthFraser                          | Soil Temp
 FLNRO-FERN   | NorthFraser                          | Snow Depth
 FLNRO-FERN   | StoneCreek                           | Water Content, m3/m3 5 cm
 FLNRO-FERN   | StoneCreek                           | Water Content, m3/m3 30 cm
 FLNRO-FERN   | StoneCreek                           | Water Content, m3/m3 15 cm
 FLNRO-FERN   | StoneCreek                           | Soil Temp
 FLNRO-FERN   | StoneCreek                           | Snow Depth
(255 rows)

The new variables have pretty much multiplied the old problem by 5.

It looks as if we may not have fixed the original "Snow Depth" variable, or else it has been re-added. And there are now several others.

rod-glover commented 1 year ago

We can see if we did or did not fix the problem earlier with this query:

db/crmp=> SELECT nw.network_name, hx.station_name, v.net_var_name FROM meta_vars v NATURAL JOIN meta_station natural join meta_history hx natural join meta_network nw WHERE nw.network_name = 'FLNRO-FERN' and hx.station_name = 'WADF1 Seedtree' order by net_var_name;
 network_name |  station_name  |        net_var_name        
--------------+----------------+----------------------------
 FLNRO-FERN   | WADF1 Seedtree | DewPtC
 FLNRO-FERN   | WADF1 Seedtree | GustSpeedms
 FLNRO-FERN   | WADF1 Seedtree | Precip_Climatology
 FLNRO-FERN   | WADF1 Seedtree | Pressurembar
 FLNRO-FERN   | WADF1 Seedtree | Rainmm
 FLNRO-FERN   | WADF1 Seedtree | RH
 FLNRO-FERN   | WADF1 Seedtree | RHn
 FLNRO-FERN   | WADF1 Seedtree | RHx
 FLNRO-FERN   | WADF1 Seedtree | Snow Depth
 FLNRO-FERN   | WADF1 Seedtree | Soil Temp
 FLNRO-FERN   | WADF1 Seedtree | SolarRadiationWm
 FLNRO-FERN   | WADF1 Seedtree | TempC
 FLNRO-FERN   | WADF1 Seedtree | Tm
 FLNRO-FERN   | WADF1 Seedtree | T_mean_Climatology
 FLNRO-FERN   | WADF1 Seedtree | Tn
 FLNRO-FERN   | WADF1 Seedtree | Tn_Climatology
 FLNRO-FERN   | WADF1 Seedtree | Tx
 FLNRO-FERN   | WADF1 Seedtree | Tx_Climatology
 FLNRO-FERN   | WADF1 Seedtree | Water Content, m3/m3 15 cm
 FLNRO-FERN   | WADF1 Seedtree | Water Content, m3/m3 30 cm
 FLNRO-FERN   | WADF1 Seedtree | Water Content, m3/m3 5 cm
 FLNRO-FERN   | WADF1 Seedtree | Wetness
 FLNRO-FERN   | WADF1 Seedtree | Wetness_20cm
 FLNRO-FERN   | WADF1 Seedtree | WindDirection
 FLNRO-FERN   | WADF1 Seedtree | Wind_n
 FLNRO-FERN   | WADF1 Seedtree | WindSpeedms
(26 rows)

Argh, we did not. Bad on us.

Given the prior analysis, I vote for:

rod-glover commented 1 year ago

Here's the first outing:

db/crmp=> BEGIN;
BEGIN
Time: 11.873 ms
db/crmp=*> UPDATE meta_vars SET net_var_name = regexp_replace(net_var_name, '\W', '_', 'g') WHERE net_var_name ~ '\W';
ERROR:  relation "meta_vars" does not exist
LINE 2:     FROM meta_vars
                 ^
QUERY:  SELECT matview_queue_refresh_row('collapsed_vars_mv', history_id)
    FROM meta_vars
    NATURAL JOIN meta_network
    NATURAL JOIN meta_station
    NATURAL JOIN meta_history
    WHERE vars_id = OLD.vars_id
CONTEXT:  PL/pgSQL function crmp.collapsed_vars_mv_meta_vars_update() line 8 at PERFORM
Time: 13.442 ms
db/crmp=!> ROLLBACK;

So the change to meta_vars causes the trigger function crmp.collapsed_vars_mv_meta_vars_update() to be run. That trigger function tries to run matview_queue_refresh_row(), using meta_vars. So far, so ordinary.

But somehow at that point it doesn't find meta_vars ?!? The user is crmp, which has a default search path of "$user", public. That ought to work, i.e., the search path resolves to crmp, public, and it does work at the "top level". But somehow within the trigger function that is not the search path.

rod-glover commented 1 year ago

@jameshiebert suggested

I believe that you should be able to use any user that has and write access ("steward" role?), which your user does. I don't have any insight as to why the relation can't be found w/in the trigger. If you want to decouple the problems, you could probably disable the trigger, run your update, and re-enable (and run) the trigger.

First to try is a different r/w user, namely steward.

rod-glover commented 1 year ago

As James pointed out, user rglover has suitable roles.

db/crmp=> BEGIN;
BEGIN
Time: 9.160 ms
db/crmp=*> UPDATE meta_vars SET net_var_name = regexp_replace(net_var_name, '\W+', '_', 'g') WHERE net_var_name ~ '\W';
UPDATE 19
Time: 130.328 ms
db/crmp=*> COMMIT;
COMMIT
Time: 9.277 ms

Check results:

db/crmp=> select network_id, network_name, vars_id, net_var_name
db/crmp-> from meta_network 
db/crmp-> natural join meta_vars 
db/crmp-> where net_var_name ~ '\W'
db/crmp-> order by network_name, net_var_name;
 network_id | network_name | vars_id | net_var_name 
------------+--------------+---------+--------------
(0 rows)

Good.

rod-glover commented 1 year ago

For the moment, this ought to solve this particular problem. Will close when evidence that things are working better comes in.