spacepy / dbprocessing

Automated processing controller for heliophysics data
5 stars 4 forks source link

No output fail in postgres #95

Closed dnadeau-lanl closed 2 years ago

dnadeau-lanl commented 2 years ago

When adding a process with no output using "addFromConfig.py" The addProcess method fails. It seems that output_product needs to be set to None like for extra_params

I am using "sqlalchemy 1.4.7"

`` [SQL: INSERT INTO process (process_name, output_product, output_timebase, extra_params) VALUES (%(process_name)s, %(output_product)s, %(output_timebase)s, %(extra_params)s) RETURNING process.process_id] [parameters: {'process_name': 'senser_HRE_unpack', 'output_product': '', 'output_timebase': 'RUN', 'extra_params': None}] (Background on this error at: http://sqlalche.me/e/14/9h9h)

[product_senser_HRE_unpack] product_name = senser_HRE_unpack relative_path = /dbingested/hre/ level = 0.0 format = senser-HRE_pktReader.{Y}{m}{d}.{VERSION}.h5 product_description = SAGE HRE Packet Reader root file inspector_filename = inspect_senser_hre_rtm.py inspector_relative_path = . inspector_description = VC34 binary to unpacked root file inspector_version = 1.0.0 inspector_output_interface = 1 inspector_active = True inspector_date_written = 2021-05-29 inspector_newest_version = True inspector_arguments = instrument_name={INSTRUMENT}

[process_senser_HRE_unpack] process_name = senser_HRE_unpack required_input1 = product_senser_HRE_unpack output_product = output_timebase = RUN extra_params = code_filename = senser_HRE_wrapper_sage.py code_relative_path = /db_wrappers/src/ code_arguments = --workdir /projects/senser/data/hre/unpacked/ code_start_date = 2000-01-01 code_stop_date = 2050-12-31 code_description = unpack SENSER HRE STP6 VC34 file code_version = 2.1.0 code_output_interface = 1 code_active = True code_date_written = 2020-08-14 code_newest_version = True code_cpu = 1 code_ram = 1


### Error message/Traceback:

[SQL: INSERT INTO process (process_name, output_product, output_timebase, extra_params) VALUES (%(process_name)s, %(output_product)s, %(output_timebase)s, %(extra_params)s) RETURNING process.process_id] [parameters: {'process_name': 'senser_HRE_unpack', 'output_product': '', 'output_timebase': 'RUN', 'extra_params': None}] (Background on this error at: http://sqlalche.me/e/14/9h9h)


### OS, Python version, and dependency version information:

Linux-3.10.0-1160.45.1.el7.x86_64-x86_64-with-glibc2.17 sys.version_info(major=3, minor=9, micro=2, releaselevel='final', serial=0) sqlalchemy=1.4.7



### Version of dbprocessing
spacepy master branch

### Closure condition

This issue should be closed when:  I can ingest `output_product` defined with a empty string `''` into sqlachelmy. 

Should we change dbutil or addFromConfig.py?
jtniehof commented 2 years ago

I'll go digging on Monday. This is a good reminder I need to work on making the unit tests run against postgres...

I'm not seeing the full traceback in your report, just the failed SQL line, so I'll reproduce it first.

I suspect there will need to be both dbUtils and addFromConfig changes. The current tests are using an empty string actually in the database.

dnadeau-lanl commented 2 years ago

I did want to dump the full traceback. This happened calling dbu.addProcess(**tmp) in addFromConfig.py

jtniehof commented 2 years ago

Took a quick look, definitely need addFromConfig changes (EDIT, nope, see below). The problem is that sqlite doesn't enforce typing (it's just a hint). So the empty string gets stored as such on sqlite, but this blows up on postgres. I'll make that change but I also need to do some other checks, will PR later today or early tomorrow.

jtniehof commented 2 years ago

Nullable columns with numerical types, which are populated by createDB (i.e. are structural for a given mission):

Process.output_product
Code.ram
Code.cpu

addFromConfig has special handling of output_product when it's an empty string (basically to avoid translating the name.) ram and cpu have no special handling, in addFromConfig or DButils. We aren't doing much type checking in DButils (addCode, etc.). However, in addCode, we are calling "toNone" for empty argument lists (i.e. null strings), and in addProcess similarly for argument strings. So I think this is probably the correct approach, updating addProcess rather than addFromConfig.

The current test for getChildTree explicitly handles either a null entry (None) or empty string in the output product. So this is probably yet another thing for #7: databases with an empty string in that column should get converted.

I'll start down this path.

jtniehof commented 2 years ago

Another one for #7: apparently there used to be a not-null constraint on output_product, which needs to be removed for this to work.