openai / gpt-2

Code for the paper "Language Models are Unsupervised Multitask Learners"
https://openai.com/blog/better-language-models/

GPT-2 generates error messages, not actual written language #276

Open ErikUden opened 4 years ago

ErikUden commented 4 years ago

I followed the installation process in DEVELOPERS.md precisely and am currently running in a Docker environment.

When running

python src/generate_longinput_samples.py

(a modified version of the generate_conditional_samples.py file)

with the following settings:

def interact_model(
    model_name='345M',
    seed=None,
    nsamples=3,
    batch_size=1,
    length=1500,
    temperature=1,
    top_k=0,
    top_p=1,
    models_dir='models',
):

(I'd like it to generate 3 samples with the 345M-parameter model. I do not quite understand what the length parameter does, but I turned it up because my input text is long.)

I get the following error messages:

WARNING:tensorflow:From src/generate_longinput_samples.py:153: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

2020-11-26 15:53:57.503505: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compile
2020-11-26 15:53:57.513755: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 3500035000 Hz
2020-11-26 15:53:57.518445: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5a4b980 executing computations on platform Host. Devices:
2020-11-26 15:53:57.518520: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
WARNING:tensorflow:From src/generate_longinput_samples.py:154: The name tf.placeholder is deprecated. Please use tf.compat.v1.placeholder instead.

WARNING:tensorflow:From src/generate_longinput_samples.py:156: The name tf.set_random_seed is deprecated. Please use tf.compat.v1.set_random_seed instead.

WARNING:tensorflow:From /gpt-2/src/sample.py:51: The name tf.AUTO_REUSE is deprecated. Please use tf.compat.v1.AUTO_REUSE instead.

WARNING:tensorflow:From /gpt-2/src/model.py:148: The name tf.variable_scope is deprecated. Please use tf.compat.v1.variable_scope instead.

WARNING:tensorflow:From /gpt-2/src/sample.py:64: to_float (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.cast` instead.
WARNING:tensorflow:From /gpt-2/src/sample.py:39: add_dispatch_support.<locals>.wrapper (from tensorflow.python.ops.array_ops) is deprecated and will be rem
rsion.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING:tensorflow:From /gpt-2/src/sample.py:67: multinomial (from tensorflow.python.ops.random_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use `tf.random.categorical` instead.
WARNING:tensorflow:From src/generate_longinput_samples.py:164: The name tf.train.Saver is deprecated. Please use tf.compat.v1.train.Saver instead.

WARNING:tensorflow:From /usr/local/lib/python3.5/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training
ent) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
2020-11-26 15:54:06.541773: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar T
a_cpu_global_jit was not set.  If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU.  To confirm that XLA is active
 -ACG|ACG \|{endOFtext|>}

Now here's the kicker: I do not know how much of that is actually an error message. Note the \|{endOFtext|>} at the end: it resembles the <|endoftext|> token, which marks where one text ends and the next begins. More error messages follow this one, but the next sample that is generated is not Sample 1 (where it should start); it is Sample 2. Here is what Sample 2 looks like:

======================================== SAMPLE 2 ========================================
 <|endofline|> <|iso8859-8|> ... </optgroup> <optgroup label="_________________________________text"> <option value="clientbcc;client/bccv3.nop">clientbtcm
;clienta;clientw">v1</option> <option value="clientcac;client/cacv3.nop">clientbcc;client/bccv3.nop</option> <option value="clientcnm;andci/>3</option> <op
tion value="clientdhcp;client/dhcpv4.nop">clientbtcm;clienta;clientw</option> <option value="clientipv4;clientipv4v4;clientipv6,clientipv6v4;clientipvgm;cl
ient/ipv6">clientiegeo1;Andes</option> <option value="clientipv6;clientipv6v6;clientipvgm-client.1860.0;clientipv6:/24246mip6cwdh0n9:/192.168.1.7(Mibed)</o
ption> <option value="clientipv6;clientipv6v6;clientipvimi;clientipv6v6v6v6v6v6v6v6v6v6v6v6v6v6v6v6v6v6v.14+:96809>clientih:l::64674h:poies</option> <strin
g value="pln[" <optgroup label="thermal alternatives">thermal options which keep the device in thermal-cooling mode</option>] ; at the computer floor tempe
rature</string> <optgroup label="thermal reduction.">Thermal reduction: &pp 2</optgroup> </optgroup> <optgroup label="changespace"> <optgroup value="soft">
0-4: close</optgroup> <optgroup value="hard">5-10: close</optgroup> <optgroup value="intermittent">sheets log exclude heat.u.ltxt only</optgroup> <optgroup
 value="folder">/base/crystalmoon/lips/DIR</optgroup> <optgroup value="install-date">/fallout 3.1</optgroup> <optgroup value="font">Open Sans MS

THIS IS NOT AN ERROR MESSAGE. This is the generated output. I do not want to show the input text, but I can assure you that it is highly literary and does not include any of the characters or words written here.

What I think is happening:

My input text is causing some error messages, and GPT-2 then takes those error messages as input as well.
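One way to check how much of the output above is log noise rather than model output would be to quiet TensorFlow's logging before running the script. The following is only a sketch under assumptions: that the WARNING:tensorflow lines are emitted through Python's logging module under the logger name "tensorflow", and that the timestamped lines come from the C++ side, which reads the TF_CPP_MIN_LOG_LEVEL environment variable. The helper name quiet_tensorflow_logs is hypothetical, and depending on the TensorFlow version the logger level may need to be set again after the import.

```python
import logging
import os


def quiet_tensorflow_logs():
    """Hypothetical helper: reduce TensorFlow log output to errors only."""
    # Assumption: this runs BEFORE `import tensorflow`, because the C++
    # runtime reads the variable at start-up. "2" hides INFO and WARNING.
    os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
    # Assumption: the Python-side deprecation warnings go through the
    # standard logging module under the logger name "tensorflow".
    logging.getLogger("tensorflow").setLevel(logging.ERROR)


quiet_tensorflow_logs()
# `import tensorflow as tf` would follow here in the real script.
```

If the strange text still appears with the logs quieted, it is coming from the model itself and not from the logging.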

Why I think it is happening:

I copied the text from an external HTML source, meaning it does not only include ASCII characters. I am uncertain which characters GPT-2 can comprehend, so I removed "ä ö ü", "à á" and so on. I removed all non-ASCII characters with the following regular expression: [^\x00-\x7F]+.

I allowed ASCII control characters ([\x00-\x1F]+) to be inside of my input text, as they are valid ASCII characters.
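For reference, the clean-up step can be sketched in Python roughly as follows. The first pattern is the one quoted above; the second pattern, which also drops control characters other than tab, newline and carriage return, is an extra assumption for comparison and was not part of my original run. The function names and the sample string are made up for illustration.

```python
import re

# Pattern from the post: any run of characters outside the ASCII range.
NON_ASCII = re.compile(r"[^\x00-\x7F]+")

# Assumption (not applied in the original run): control characters other
# than tab (\x09), newline (\x0A) and carriage return (\x0D), plus DEL.
CONTROL = re.compile(r"[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+")


def strip_non_ascii(text):
    """Remove every non-ASCII character from text."""
    return NON_ASCII.sub("", text)


def strip_control(text):
    """Remove ASCII control characters, keeping tab, LF and CR."""
    return CONTROL.sub("", text)


# Made-up sample containing umlauts, a NUL byte and a vertical tab.
sample = "Gr\u00fc\u00dfe\x00 from\x0b a page\n"
cleaned = strip_control(strip_non_ascii(sample))
print(repr(cleaned))
```

Running both steps on the input file and comparing the result with the ASCII-only version would show whether any control characters survived the first clean-up.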

Anyway, if anyone has any idea why this is happening or why I am getting error messages instead of generated text, please let me know!