openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

tf_upgrade_v2 fails. #267

Open Pavelrst opened 4 years ago

Pavelrst commented 4 years ago

Hi, I've tried to upgrade the code to tf2 using the tf_upgrade_v2 tool and got the following error:

Traceback (most recent call last):
  File "/usr/local/bin/tf_upgrade_v2", line 8, in <module>
    sys.exit(main())
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/tools/compatibility/tf_upgrade_v2_main.py", line 152, in main
    args.input_tree, output_tree, args.copy_other_files)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/tools/compatibility/ast_edits.py", line 1050, in process_tree
    _, l_report, l_errors = self.process_file(input_path, output_path)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/tools/compatibility/ast_edits.py", line 900, in process_file
    temp_file)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow_core/tools/compatibility/ast_edits.py", line 958, in process_opened_file
    lines = in_file.readlines()
  File "/usr/lib/python3.6/encodings/ascii.py", line 26, in decode
    return codecs.ascii_decode(input, self.errors)[0]
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc2 in position 786: ordinal not in range(128)

The reason for this error is the following line in encoder.py: https://github.com/openai/gpt-2/blob/0574c5708b094bfa0b0f6dfe3fd284d9a045acd9/src/encoder.py#L19

I think that I can bypass this issue by deleting the line, converting the code, and then inserting the line back. Do you think it's a valid solution, or maybe it can break something I didn't think about? Thanks.
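The error above can be reproduced without TensorFlow. The following is a minimal sketch, assuming the linked line contains non-ASCII characters such as "¡" (whose UTF-8 encoding starts with byte 0xc2) and that the Python 3.6 process defaulted to ASCII when opening files. The `source_line` string and the `encoder_snippet.py` file name are illustrative stand-ins, not the real repository contents.

```python
import os
import tempfile

# Illustrative stand-in for the line in encoder.py; assumed to contain "¡".
source_line = 'bs = list(range(ord("¡"), ord("¬") + 1))\n'

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "encoder_snippet.py")  # hypothetical file name
    with open(path, "w", encoding="utf-8") as f:
        f.write(source_line)

    # Reading as ASCII mimics a process whose default file encoding is ASCII.
    try:
        with open(path, encoding="ascii") as f:
            f.readlines()
        ascii_error = None
    except UnicodeDecodeError as exc:
        ascii_error = exc

    # Reading the same bytes as UTF-8 succeeds and returns the original text.
    with open(path, encoding="utf-8") as f:
        utf8_lines = f.readlines()

print(ascii_error)
print(utf8_lines == [source_line])
```

If the default encoding is indeed the cause, running the tool under a UTF-8 locale (for example `LC_ALL=C.UTF-8`, assuming that locale is installed) may be a less invasive workaround than removing the line and adding it back.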

mikolasan commented 4 years ago

@Pavelrst, I ran tf_upgrade_v2 and didn't hit any issues, though I was using Python 3.8.1 and TensorFlow 2.3.0:

tf_upgrade_v2 --intree src --outtree src2 --reportfile report.txt

In any case, encoder.py doesn't use the tensorflow module, so the script will not change it: tf_upgrade_v2 mainly replaces tf with tf.compat.v1 where it sees certain functions.
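One way to check this reasoning before converting is to scan the source tree for files that contain non-ASCII bytes and see whether any of them mention tensorflow. The sketch below is only an illustration: `scan_tree` is a hypothetical helper, and the demo files are stand-ins created in a temporary directory, not the real repository contents.

```python
import tempfile
from pathlib import Path


def scan_tree(root):
    """Map each .py file under root to (has_non_ascii, mentions_tensorflow)."""
    report = {}
    for path in sorted(Path(root).rglob("*.py")):
        raw = path.read_bytes()
        has_non_ascii = any(byte > 0x7F for byte in raw)
        mentions_tf = b"tensorflow" in raw
        report[path.relative_to(root).as_posix()] = (has_non_ascii, mentions_tf)
    return report


# Demo on stand-in files (hypothetical contents).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "encoder.py").write_text('x = ord("¡")\n', encoding="utf-8")
    (root / "model.py").write_text("import tensorflow as tf\n", encoding="utf-8")
    report = scan_tree(root)

# Files that could trip an ASCII reader, and whether the upgrade would touch them.
risky = {name for name, (non_ascii, _) in report.items() if non_ascii}
print(report)
print(risky)
```

A file that shows up as non-ASCII but does not mention tensorflow can probably be copied over unchanged, which is consistent with the suggestion above that encoder.py needs no conversion.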