I have spotted a number of .encode("utf-8") usages in the CSV writers, e.g.:
grep encode ../ukcp-data-processor/ukcp_dp/file_writers/*
grep: ../ukcp-data-processor/ukcp_dp/file_writers/__pycache__: Is a directory
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_cdf.py: var = self.input_data.get_value_label(InputType.VARIABLE)[0].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_default.py: output_data_file.write(title.encode("utf-8").replace("\n", " "))
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_jp.py: x = self.input_data.get_value_label(InputType.VARIABLE)[0].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_jp.py: y = self.input_data.get_value_label(InputType.VARIABLE)[1].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_pdf.py: var = self.input_data.get_value_label(InputType.VARIABLE)[0] #.encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_plume.py: var = self.input_data.get_value_label(InputType.VARIABLE)[0].encode(
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_plume.py: var = self.input_data.get_value_label(InputType.VARIABLE)[0].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_postage_stamp_map.py: var = self.input_data.get_value_label(InputType.VARIABLE)[0].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_sample.py: self.input_data.get_value_label(InputType.VARIABLE)[i].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_sample.py: var = self.input_data.get_value_label(InputType.VARIABLE)[i].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_subset.py: var = self.input_data.get_value_label(InputType.VARIABLE)[0].encode("utf-8")
../ukcp-data-processor/ukcp_dp/file_writers/_write_csv_three_map.py: var = self.input_data.get_value_label(InputType.VARIABLE)[0].encode("utf-8")
However, I am seeing header lines in the output file like this:
b'Minimum air temperature anomaly at 1.5m (\xc2\xb0C)',...
But if I remove the .encode("utf-8") call, I get a sensible header line rather than the repr of an encoded bytestring, e.g.:
Minimum air temperature anomaly at 1.5m (°C),...
This example was generated using this script with a test request:
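Independently of that script, I think the behaviour can be reproduced in isolation with the minimal sketch below. It assumes Python 3 and that the header is assembled with string formatting; the `header_line` helper, the `Date` column and the temporary file are hypothetical and not taken from the writers. It shows that formatting an encoded label into a `str` embeds the bytes repr, while leaving the label as `str` and letting the file object do the encoding gives the expected line.

```python
import os
import tempfile


def header_line(label, encode):
    """Hypothetical helper: build a CSV header line from a variable label."""
    var = label.encode("utf-8") if encode else label
    # In Python 3, formatting a bytes object into a str uses its repr,
    # which is where the b'...' and \xc2\xb0 escapes come from.
    return "{},Date".format(var)


label = "Minimum air temperature anomaly at 1.5m (\u00b0C)"

bad = header_line(label, encode=True)
good = header_line(label, encode=False)
print(bad)   # header containing the bytes repr
print(good)  # header containing the degree sign

# Possible alternative: keep str everywhere and set the encoding on the file.
path = os.path.join(tempfile.mkdtemp(), "header.csv")
with open(path, "w", encoding="utf-8", newline="") as output_data_file:
    output_data_file.write(good + "\n")

with open(path, "rb") as check:
    raw = check.read()
```

If that is what is happening in the writers, the `.encode("utf-8")` calls look like Python 2 leftovers, and specifying the encoding when the output file is opened might be enough.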
Thanks