Specify the snapshot encoding

joseph-roitman / pytest-snapshot

A plugin for snapshot testing with pytest.

MIT License

114 stars 12 forks source link

Specify the snapshot encoding #24

Closed baldychristophe closed 3 years ago

baldychristophe commented 3 years ago

Hello,

Thank you for this great library, it helps me a lot.

However, it would be even better to be able to specify the encoding to read and write the snapshots.

I work on a windows computer and I generate snapshots with special characters that are encoded in charmap, the default encoding for windows (That I cannot override in the code). Then, I build my application and run the tests on remote linux servers which use utf-8 as the default encoding (that I cannot override as well). As a result, the snapshots cannot be read on the linux servers because of the encoding difference and my tests fail. To fix my problem, I would need to be able to force an encoding when reading and writing the snapshots.

Fortunately, I think it can be easily fixed by adding an encoding argument to the read_text here and write_text here and here.

Let me know if you think it can be a good improvement and if you would like a PR from me.

Thank you

joseph-roitman commented 3 years ago

Hi, thanks for the bug report. I think I'll hardcode utf-8 into the code. Do you think there is any reason to make the encoding configurable?

baldychristophe commented 3 years ago

Hi, thank you for the quick response.

I think forcing the encoding to utf-8 would be OK most of the time and it fixes my problem.

However, the most optimal solution for me would be to add a snapshot_encoding argument, defaulting to None, to let the user choose their encoding. This solution would have the benefit to avoid introducing a breaking change for the users who already use this library with a default encoding on their computer different than utf-8. And also the users who don't want to specify an encoding won't see the change.

I made a quick PR to show you, let me know if it is any good?

Thank you very much

joseph-roitman commented 3 years ago

I think defaulting to the system default encoding would not be ideal because then the snapshots become platform specific as happened to you. The package could offer a way to change the encoding from the default of utf-8 though I can't think of a reason why a developer would want this.

In regards to breaking changes I guess I should probably release a stable v1.0.0.

Before doing that, I thought of a similar bug that should be fixed, snapshots currently use platform-specific newlines. This probably didn't bother anyone because git usually translates newlines to platform specific ones on checkout. This bug would appear if a Linux and windows machine shared the same filesystem (e.g. WSL).

Edit: On second thought, the default should be to use platform specific newlines to be compatible with git's default behavior (so nothing needs to be changed).