Some code files I parse include Unicode characters (such as "curly quotes"). When I encounter these, I get the error: "UnicodeDecodeError: 'charmap' codec can't decode byte 0x9d ..."
I can work around this in my case by reading the file in with encoding="utf-8" and writing out a temp file using the default encoding. It would make more sense to me, though, either to be able to pass the encoding to the CParser init, or to be able to pass string buffers of code or file-like objects as an alternative to just file names. That way I can open the file in whatever way I need without the parser having to guess.
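For reference, the temp-file workaround looks roughly like the sketch below. The helper name `transcode_to_default` is hypothetical and not part of any library. It assumes the source file really is UTF-8, and that "default encoding" means whatever `locale.getpreferredencoding(False)` reports. Characters the target encoding cannot represent are replaced with "?" rather than raising.

```python
import locale
import os
import tempfile


def transcode_to_default(src_path, src_encoding="utf-8"):
    """Copy src_path into a temp file re-encoded in the locale's preferred encoding.

    Hypothetical helper for illustration; returns the path of the temp file.
    """
    # newline="" keeps the original line endings untouched on read and write.
    with open(src_path, "r", encoding=src_encoding, newline="") as src:
        text = src.read()

    target_encoding = locale.getpreferredencoding(False)

    # Keep the original extension in case the consumer cares about it.
    suffix = os.path.splitext(src_path)[1]
    fd, tmp_path = tempfile.mkstemp(suffix=suffix)

    # errors="replace" substitutes "?" for anything the target encoding lacks.
    with os.fdopen(fd, "w", encoding=target_encoding,
                   errors="replace", newline="") as dst:
        dst.write(text)

    return tmp_path
```

The returned path is then given to the parser in place of the original file name, and the temp file is deleted afterwards with `os.remove`. This is exactly the extra step that an encoding argument or file-like object support would make unnecessary.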