kasei / perlrdf

Deprecated in favor of the Attean package
26 stars 25 forks source link

no-op encode/decode fixes #160 #161

Open doriantaylor opened 5 years ago

doriantaylor commented 5 years ago

160 fixed for PG, tests pass over here.

kasei commented 5 years ago

This looks mostly fine to me, but I don't see anything failing on my Pg setup using the current code in HEAD. Could you add tests that are fixed by this PR? Or describe in more detail something I should be seeing failing with the existing code? Thanks!

doriantaylor commented 5 years ago

I was using Store::DBI against Postgres and typed in something with an acute accent into a web app and it came back out mangled. The entry also showed up in psql mangled alongside un-mangled entries from a different database, so that's how I knew it wasn't just a display problem.

First I tried commenting out the encode/decode calls and the problem went away. Then I looked at the DBD::Pg docs and the Postgres docs and concluded it was already handling the character sets automatically. I can't speak for MySQL or SQLite cause I was already three hours into this.

As for the test, it looks like all the store tests are centralized, so I didn't want to chance breaking tests for the other drivers. One weird note however: accent-y test content also showed up in psql all mangled, but the test passed. So maybe the test itself has been returning false positives? Again, I was already three hours into a solution and didn't want to open a Pandora's box.

kasei commented 5 years ago

Understood. If the current code is broken (which it sounds like it is), then Pandora's box needs to be opened! :)

I'll look into writing some tests that expose this problem before merging the code.

Thanks!

kasei commented 5 years ago

I see the encoding problems, but I'm afraid I don't know enough about how modern Pg and DBD::Pg work regarding unicode. My attempt at tests for this show what looks like good utf8 encoded bytes coming back from the database, but they're not marked as utf8 strings in perl, so it just ends up looking like a collection of bytes. I'll need to dig into this further to figure out what I'm missing.