glue-viz / glue

Linked Data Visualizations Across Multiple Files
http://glueviz.org

Fix an issue that caused the astropy table reader on Windows to behave differently to other platforms #2519

Closed astrofrog closed 2 months ago

astrofrog commented 2 months ago

On Windows, the default encoding/locale seems to be cp1252, which will decode random binary files without complaining. That is not ideal for automatic format recognition.
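As a minimal sketch of the behaviour described above (not the code changed in this PR), the snippet below decodes the eight-byte PNG signature with both encodings. The helper name `decodes_as` is hypothetical and exists only for this illustration.

```python
# Illustration only: the PNG file signature is valid cp1252 but invalid UTF-8.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decodes_as(data: bytes, encoding: str) -> bool:
    """Return True if `data` decodes under `encoding` without an error."""
    try:
        data.decode(encoding)
    except UnicodeDecodeError:
        return False
    return True


# cp1252 maps byte 0x89 to a printable character, so decoding succeeds.
cp1252_ok = decodes_as(PNG_SIGNATURE, "cp1252")
# In UTF-8, 0x89 is a continuation byte and cannot start a character.
utf8_ok = decodes_as(PNG_SIGNATURE, "utf-8")

print(f"cp1252: {cp1252_ok}, utf-8: {utf8_ok}")
```

A reader that relies on a decoding error to reject binary input would therefore accept this data under cp1252 and reject it under UTF-8.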

cc @dhomeier

dhomeier commented 2 months ago

That seems to resolve the test failures; I wonder whether there might still be text formats that we are not testing here. I am now seeing the same dev test installation failures over at Astropy; the cause must be somewhere on the Anaconda side, hopefully only temporary.

dhomeier commented 2 months ago

Finally got the dev jobs through, so this seems to work at least for all tests. I've been wondering whether pandas text input might run into similar problems. On macOS, I get the following for the test data:

>>> with make_file(data, '.png') as fname:
...     for df in data_factory:
...         print(df.label, df.priority, df.identifier(fname))
...     
FITS file 100 False
HDF5 file 100 False
Numpy save file 100 False
ASCII Table 1 False
FITS table 1 False
VO table 1 False
AASTeX Table 0 False
Auto 0 True
CDS Catalog 0 False
Catalog (astropy.table parser) 0 False
DAOphot Catalog 0 False
Excel 0 False
IPAC Catalog 0 False
Image 0 True
LaTeX Table 0 False
Pandas Table 0 False
SExtractor Catalog 0 False
CASA PPV Cube -1000 False

My interpretation is that even if Pandas or LaTeX identified the file as True on Windows, they would not get in the way, because Image is tried before them. I wonder whether setting the locale with locale.setlocale(locale.LC_ALL, locale=(None, 'utf-8')) could be an alternative that would not let Table.read() without a format fail.
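The ordering argument can be sketched as follows. This is an assumption-laden illustration, not glue's actual dispatch code: it assumes candidates are tried by descending priority and then by label, as the listing suggests, and that the first positive identifier wins. The `Candidate` type and `first_match` helper are hypothetical, and only a few rows from the listing are used (the Auto entry is left out to focus on Image, LaTeX, and Pandas).

```python
from typing import Iterable, NamedTuple, Optional


class Candidate(NamedTuple):
    label: str
    priority: int
    identified: bool  # result of the identifier on the test file


# A few rows copied from the macOS listing above.
candidates = [
    Candidate("Pandas Table", 0, False),
    Candidate("Image", 0, True),
    Candidate("LaTeX Table", 0, False),
    Candidate("FITS file", 100, False),
    Candidate("ASCII Table", 1, False),
]


def first_match(cands: Iterable[Candidate],
                also_true: frozenset = frozenset()) -> Optional[str]:
    """Return the first label that identifies the file.

    Assumed order: highest priority first, ties broken by label.
    `also_true` simulates identifiers that would wrongly return True.
    """
    ordered = sorted(cands, key=lambda c: (-c.priority, c.label))
    for cand in ordered:
        if cand.identified or cand.label in also_true:
            return cand.label
    return None


macos_choice = first_match(candidates)
# Hypothetical Windows case: Pandas and LaTeX also claim the file.
windows_choice = first_match(
    candidates, also_true=frozenset({"Pandas Table", "LaTeX Table"})
)
print(macos_choice, windows_choice)
```

Under these assumptions the selected reader is the same in both cases, because Image sorts ahead of LaTeX Table and Pandas Table at priority 0.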

pllim commented 2 months ago

CI is green, let's merge!

dhomeier commented 2 months ago

Yes, I guess if issues with other text formats should pop up, we can always fix them later. Otherwise this will hopefully get us through until 3.15 ;-)