cmusphinx / g2p-seq2seq

G2P with Tensorflow
Other
670 stars 194 forks source link

Python 3 compatibility #47

Closed nigma closed 8 years ago

nigma commented 8 years ago

Make the code run on both Python 2 and 3 from the same codebase.

Please note that this pull request also includes changes from #46, which is a prerequisite for this PR.

Tested with python setup.py test, interactive mode and wordlist decode.

nshmyrev commented 8 years ago

Many thanks you for contribution

I cherry-picked two of the 4 commits you submitted.

I'm sorry, I don't see the point in .editorconfig. Its not something I'd like to keep in repo.

On readline thing, I'd look for more clean solution without compat code.

nigma commented 8 years ago

Sounds good!

I'm afraid some version-specific code will be required to handle the unicode conversion, or at least I didn't come across a clean alternative solution.

On Python 3 the sys.stdin is a text file and the IO wrapper handles the decoding automatically using console codepage.

There is access to the underlying sys.stdin.buffer bytes buffer, but it is block-buffered rather than line-buffered and it won't work well with codecs reader in a REPL mode.

A more suitable method of collecting user input would be using raw_input() on 2.7 and input() on 3 (avoids buffering of print() stdout), but one still needs to convert the input to unicode on 2.7 if the underlying g2p implementation expects that.

One or another solution for a compat layer is used in most of single-codebase packages. In tensor flow this is through six dependency, in flask there is a small _compat.py module, etc.

Let me know if you have some other implementation in mind and I can adjust the pull request.

nshmyrev commented 8 years ago

Six is a good idea since we depend on tensorflow anyway. I've pushed a fix, please verify if everything is ok.

Thanks again!