scaife-viewer / backend

Packages and utilities to build Scaife Viewer backends using ATLAS / CTS resolvers

Fix: Ensure that Pandas does not treat "None" as NaN #76

Open jacobwegner opened 8 months ago

jacobwegner commented 8 months ago

Refs https://github.com/scaife-viewer/beyond-translation-site/commit/d4efc2aae249edd8f56f222b0ff694104cf209e8.

The issue was that some English texts contain the literal word "None", which Pandas 2 parses as its NaN value by default when reading CSV data. Then, when we generate SQL statements from the resulting data, Pandas tries to insert null values in place of that text.
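To illustrate the second half of that, here is a minimal sketch of how a NaN cell ends up as a SQL NULL. It assumes the rows are written with DataFrame.to_sql over a sqlite3 connection; the table name, column names and sample values are made up for the example, and the actual backend may generate its SQL differently.

```python
import sqlite3

import pandas as pd

# Hypothetical data: the first row stands in for a passage whose text was
# the word "None" and has already been turned into NaN while loading.
df = pd.DataFrame(
    {
        "ref": ["1.1", "1.2"],
        "text": [float("nan"), "Some text"],
    }
)

conn = sqlite3.connect(":memory:")
# Pandas writes missing values as SQL NULL.
df.to_sql("passages", conn, index=False)

rows = conn.execute("SELECT ref, text FROM passages ORDER BY ref").fetchall()
conn.close()

# The first row comes back with NULL (Python None) instead of the text.
print(rows)
```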

I read https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html and found the keep_default_na parameter. Using it seemed to work, so once we test this change, we can use Pandas 2 instead of Pandas 1.
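A minimal sketch of the difference, assuming the fix passes keep_default_na=False to read_csv. The CSV content, column names and the load helper are invented for illustration and are not taken from the linked commit.

```python
import io

import pandas as pd

# Hypothetical CSV where one passage consists of the word "None".
CSV_DATA = "ref,text\n1.1,None\n1.2,Some text\n"


def load(keep_default_na):
    """Read the sample CSV as strings with the given keep_default_na setting."""
    return pd.read_csv(
        io.StringIO(CSV_DATA),
        dtype=str,
        keep_default_na=keep_default_na,
    )


# With the default NA handling, Pandas 2 parses "None" as a missing value.
default_df = load(keep_default_na=True)

# With the default NA strings disabled, the word is kept as ordinary text.
literal_df = load(keep_default_na=False)

print(default_df["text"].isna().tolist())
print(literal_df["text"].tolist())
```

Note that keep_default_na=False disables every default NA string, not only "None", so empty cells are read as empty strings rather than NaN. If some missing-value markers still need to be recognised, they can be listed explicitly through the na_values parameter.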

jacobwegner commented 8 months ago

@jtauber I think this change is backwards compatible with Pandas 1.x, but I have not done a lot of testing.

I did test this out with Python 3.12 and Pandas 1.5.3 and Pandas 2.2.0 as part of writing up https://github.com/scaife-viewer/beyond-translation-site/pull/196