Open aytey opened 3 years ago
Well, it's technically a bug, but I don't see why somebody would use a non-utf8 encoding for files? Doctor, it hurts. Then don't do it :D
> I don't see why somebody would use a non-utf8 encoding for files?
The tool my company builds (and one of the reasons I use `cvise`) automatically builds data-driven unit test "wrappers" for C/C++ -- as part of this, we just "gobble up" whatever code our customers have.
Our customers 100% have code that isn't utf8/ascii (the above comes from a support case I was helping with).
I think that adding something like:
(code I wrote) would make `cvise` more robust here.
Would you be happy if I made some utility function like `safe_open` that wraps `open` (along with a call to `get_file_encoding`)? I'm thinking that something like:
could then use `safe_open` vs. "regular" `open`.
Thoughts?
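The snippets referred to above did not survive in this thread, so here is a minimal sketch of what such a helper might look like. The names `safe_open` and `get_file_encoding` come from the comment above, but their bodies, the `default` parameter, and the `latin-1` fallback are assumptions made for illustration. `chardet.detect` is the detection call from the `chardet` package; the sketch treats the package as optional so it still runs without it.

```python
import os

try:
    import chardet  # third-party package mentioned in this thread
except ImportError:
    chardet = None  # the sketch still works, with a cruder fallback


def get_file_encoding(path, default="utf-8"):
    """Guess the text encoding of `path` (helper body is an assumption)."""
    if not os.path.exists(path):
        return default  # e.g. a file that is about to be created
    with open(path, "rb") as f:
        raw = f.read()
    candidates = [default]
    if chardet is not None:
        guess = chardet.detect(raw)["encoding"]  # may be None
        if guess:
            candidates.append(guess)
    # Only accept an encoding that really decodes the whole file.
    for encoding in candidates:
        try:
            raw.decode(encoding)
            return encoding
        except (UnicodeDecodeError, LookupError):
            continue
    return "latin-1"  # decodes any byte sequence, possibly as mojibake


def safe_open(path, mode="r", **kwargs):
    """Like open(), but picks an encoding for text mode if none is given."""
    if "b" not in mode and "encoding" not in kwargs:
        kwargs["encoding"] = get_file_encoding(path)
    return open(path, mode, **kwargs)
```

A caller would then write `with safe_open(path) as f:` wherever it currently calls `open(path)`. Trying UTF-8 first keeps the common case cheap and avoids a wrong guess on files that are already valid UTF-8.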
Alternatively, maybe `cvise` should check if a file is openable as utf8 before starting, and "fail early" (e.g., as part of the initial sanity check) if it isn't?
Thank you for your ideas. So you're using the Python package `chardet`, which seems reasonable to me.
A question: if you have a non-utf8 test-case, what about converting it to utf8 first and then letting `cvise` run? At the end you can convert it back to a given encoding.
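As a rough illustration of that convert-first workflow, the sketch below re-encodes a file in place. The helper name `convert_encoding` and its parameters are hypothetical and not part of `cvise`; the source encoding is assumed to be known already (for example from a `chardet` guess).

```python
def convert_encoding(path, src, dst):
    """Re-encode the file at `path` in place from `src` to `dst`.

    Hypothetical helper for illustration. Binary mode is used on both
    sides so that line endings are left untouched.
    """
    with open(path, "rb") as f:
        text = f.read().decode(src)
    with open(path, "wb") as f:
        f.write(text.encode(dst))
```

The idea would be to call `convert_encoding(path, "latin-1", "utf-8")` before the reduction and `convert_encoding(path, "utf-8", "latin-1")` on the reduced file afterwards. Since a reduction only removes text, anything left in the reduced file should still be representable in the original encoding.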
The problem I see with `safe_open` is that we'll need to use it in various places. The same goes for writing the output.
How about if `cvise` (as part of the "initial sanity check") validates whether the file can be opened as `utf8`? If it can't, we have three options:
1. `cvise` will give up
We could actually do all three of these; without 2 or 3, 1 happens. If you have 2, then 3 is redundant. If you have 3, 2 is redundant (and it won't run things like `IncludesPass`).
Thoughts?
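The check itself, together with the "give up" behaviour, might look roughly like the sketch below. The function names `is_utf8` and `sanity_check_encoding` and the error message are made up for illustration; where such a check would hook into the existing sanity check in `cvise` is not shown.

```python
def is_utf8(path):
    """Return True if the whole file at `path` decodes as UTF-8."""
    try:
        with open(path, encoding="utf-8") as f:
            f.read()
    except UnicodeDecodeError:
        return False
    return True


def sanity_check_encoding(paths):
    """Fail early if any test-case cannot be read as UTF-8.

    Hypothetical helper: exits with a message that names every
    offending file, so nothing is reduced on a bad input.
    """
    bad = [p for p in paths if not is_utf8(p)]
    if bad:
        raise SystemExit("not valid UTF-8: " + ", ".join(bad))
```

Reading the whole file matters here: an invalid byte near the end of a file would otherwise only surface once a pass is already running.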
Well, I like doing 2) using `chardet`. That's going to help in most cases (hopefully).
@andrewvaughanj I've just implemented the first version, can you please take a look?
I can try this next week; should `args.to_utf8` have a "sanity check" afterwards?
Well, the conversion does happen before the normal sanity check. Thanks for testing.
Any update about the testing, please?
PING :)
Any news about this @andrewvaughanj ?
For:
running `cvise --start-with-pass IncludesPass` gives:
(this was after editing `cvise/utils/testing.py` to set `GIVEUP_CONSTANT` to 1)
and creates:
and