CenterForTheBuiltEnvironment / clima

The CBE Clima Tool is a web-based application built to support the needs of architects and engineers interested in climate-adapted design. It allows users to analyze the climate data of more than 27,500 locations worldwide using the data contained in EPW files.
https://clima.cbe.berkeley.edu
MIT License
61 stars 21 forks

Seemingly valid EPW file cannot be uploaded #201

Closed giobetti closed 1 year ago

giobetti commented 1 year ago

Describe the bug: A seemingly valid EPW file is not recognized as such.

To reproduce:

  1. Try to load one of the files attached below in Clima.
  2. Wait for the error message "The file you have uploaded is not an EPW file" to appear.

Expected behavior: Clima loads the EPW file.


Additional context: Please find below the file that is causing problems. It also originates from OneBuilding.org. To make things weirder, it loads without problems from the Clima map, but once it is downloaded directly from climate.onebuilding.org, it doesn't work (download link: https://climate.onebuilding.org/WMO_Region_3_South_America/BRA_Brazil/PA_Para/BRA_PA_Belem.816800_INMET.zip ).

Once it is loaded in Clima via the map and downloaded using the "download epw" button in the "climate summary" tab, all is fine.

mccalluc commented 1 year ago

I can reproduce this error: I unzipped the linked file, and these are its contents:

BRA_PA_Belem.816800_INMET.clm
BRA_PA_Belem.816800_INMET.ddy
BRA_PA_Belem.816800_INMET.epw
BRA_PA_Belem.816800_INMET.pvsyst
BRA_PA_Belem.816800_INMET.rain
BRA_PA_Belem.816800_INMET.stat
BRA_PA_Belem.816800_INMET.wea

I then started the application locally, uploaded the .epw in the UI, and saw the error you described.

There was no more detailed error message in the log, so I searched for that error message in the code, and traced the problem to app_select.py:

        except Exception as e:
            # print(e)
            return (
                None,
                None,
                True,
                messages_alert["wrong_extension"],
                "warning",
            )

Uncommenting the print and restarting the application gives us more information:

'utf-8' codec can't decode byte 0xe1 in position 1066: invalid continuation byte

Looking at the file byte-by-byte in that neighborhood:

hexdump -C -s 1060 -n 16 ~/Downloads/BRA_PA_Belem.816800_INMET/BRA_PA_Belem.816800_INMET.epw
00000424  61 64 6f 73 20 62 e1 73  69 63 6f 73 20 72 65 67  |ados b.sicos reg|

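Interpreted as UTF-8, this byte sequence is invalid: 0xe1 starts a three-byte sequence, but the next byte, 0x73, is not a continuation byte. Interpreted as Latin-1 (ISO-8859-1), 0xe1 is "á", which gives "básicos".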
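To double-check that reading, here is a minimal Python sketch that decodes the 16 bytes from the hexdump both ways. The variable names are only illustrative and are not taken from the Clima code.

```python
# The 16 bytes shown in the hexdump, starting at file offset 1060.
raw = bytes.fromhex("61646f732062e17369636f7320726567")

# Strict UTF-8 decoding fails on the 0xe1 byte.
try:
    raw.decode("utf-8")
    utf8_error = None
except UnicodeDecodeError as err:
    utf8_error = err

# Latin-1 maps every byte to exactly one character, so this cannot fail.
latin1_text = raw.decode("latin-1")

print(utf8_error)
print(latin1_text)
```

The failing byte is at index 6 of this slice, and 1060 + 6 = 1066, which matches the position in the error message.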

Most EPW files contain only ASCII, which is why encoding problems weren't noticed before. I'll assume that Latin-1 (ISO-8859-1) is the correct encoding and make a PR.
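One way to act on that assumption without changing how valid UTF-8 files are read is to try UTF-8 first and fall back to Latin-1. This is only a sketch of that idea and not necessarily what the PR does; `decode_epw_bytes` is a hypothetical helper name, not a function in the Clima code.

```python
def decode_epw_bytes(raw: bytes) -> str:
    """Decode uploaded EPW bytes, assuming Latin-1 when UTF-8 fails.

    Hypothetical helper for illustration only.
    """
    try:
        # Pure-ASCII files and valid UTF-8 files decode here.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Latin-1 maps all 256 byte values, so this never raises.
        return raw.decode("latin-1")


print(decode_epw_bytes(b"dados b\xe1sicos"))
```

Because Latin-1 accepts any byte sequence, the fallback never raises, but it can silently produce wrong characters if a file actually uses some other encoding.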

(This might have been easier to debug if the error had been logged instead of the print being commented out, but that's subjective.)
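For illustration, a minimal sketch of what logging the exception could look like with the standard `logging` module. The function `parse_upload` and its return values are assumptions made up for this example and do not mirror the real callback in app_select.py.

```python
import logging

logger = logging.getLogger(__name__)


def parse_upload(raw: bytes):
    """Hypothetical stand-in for the upload handler's try/except block."""
    try:
        return raw.decode("utf-8")
    except Exception:
        # logger.exception logs at ERROR level and appends the traceback,
        # so the root cause shows up in the log without editing the code.
        logger.exception("Could not parse uploaded EPW file")
        return None


print(parse_upload(b"b\xe1sicos"))
```

The user-facing alert can stay as generic as it is now; the difference is that the underlying UnicodeDecodeError ends up in the server log.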