[Open] jledrumics opened this issue 5 months ago
What happened?

Python Apache Beam version 2.44

When using WriteToBigQuery with the STORAGE_WRITE_API method and a NULLABLE nested (RECORD) field, the conversion still tries to resolve the record's inner fields even when the field is absent from the input dict or set to None, and then fails. This happens in the beam_row_from_dict method.
A failing example:

```python
from apache_beam.io.gcp.bigquery_tools import beam_row_from_dict

schema = {
    "fields": [
        {"name": "log_id", "type": "STRING", "mode": "REQUIRED"},
        {
            "name": "nested",
            "type": "RECORD",
            "mode": "NULLABLE",
            "fields": [
                {"name": "id", "type": "STRING", "mode": "REQUIRED"},
                {"name": "source", "type": "STRING", "mode": "REQUIRED"},
                {"name": "channel", "type": "STRING", "mode": "NULLABLE"},
            ],
        },
    ]
}

row = {
    "log_id": "727254-32022246-026",
    "nested": None,  # same failure when omitting the field entirely
}

beam_row = beam_row_from_dict(row, schema)
print(beam_row)
```
As a workaround, pass the nested record explicitly with all of its inner fields set to None:

```python
row = {
    "log_id": "727254-32022246-026",
    "nested": {"id": None, "source": None, "channel": None},
}
```
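For reference, the behavior being requested can be sketched as a standalone function (this is not Beam's actual implementation; the function name and schema walk are illustrative only): when a RECORD field is NULLABLE and its value is missing or None, emit None instead of recursing into the inner fields.

```python
def dict_to_row_values(row, schema):
    """Illustrative dict-to-row conversion over a BigQuery-style schema.

    Recurses into RECORD fields, but short-circuits when a NULLABLE
    record is absent or None instead of resolving its inner fields.
    """
    out = {}
    for field in schema["fields"]:
        name = field["name"]
        value = row.get(name) if row else None
        if field["type"] == "RECORD":
            if value is None:
                if field.get("mode") == "NULLABLE":
                    out[name] = None  # do NOT recurse into inner fields
                    continue
                raise ValueError(f"required record field {name!r} is missing")
            out[name] = dict_to_row_values(value, field)
        else:
            out[name] = value
    return out


schema = {
    "fields": [
        {"name": "log_id", "type": "STRING", "mode": "REQUIRED"},
        {
            "name": "nested",
            "type": "RECORD",
            "mode": "NULLABLE",
            "fields": [
                {"name": "id", "type": "STRING", "mode": "REQUIRED"},
                {"name": "source", "type": "STRING", "mode": "REQUIRED"},
                {"name": "channel", "type": "STRING", "mode": "NULLABLE"},
            ],
        },
    ]
}

# The case from the report: nested is None, so no recursion happens.
print(dict_to_row_values({"log_id": "727254-32022246-026", "nested": None}, schema))
# → {'log_id': '727254-32022246-026', 'nested': None}
```

This is only a sketch of the expected semantics, not a proposed patch to bigquery_tools.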
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components