openai / gpt-2-output-dataset

Dataset of GPT-2 outputs for research in detection, biases, and more
MIT License

python vs. python3 in line 96 of /detector/server.py #8

Closed AndrewBarfield closed 4 years ago

AndrewBarfield commented 4 years ago

This is a simple problem. Just posting so others are aware.

To get the web-based GPT-2 Output Detector to work, I had to change "python" to "python3" on line 96 of /detector/server.py. See: https://github.com/openai/gpt-2-output-dataset/blob/12459ab3ed239895558beb7063ec95ffc46cd796/detector/server.py#L96

System:
OS: Ubuntu 19.10 eoan  Kernel: x86_64 Linux 5.3.0-19-generic  Uptime: 13d 6h 1m
Packages: 2125  Shell: bash 5.0.3  Resolution: 2560x1440
DE: GNOME  WM: GNOME Shell  WM Theme: Adwaita
GTK Theme: Yaru-dark [GTK2/3]  Icon Theme: Yaru  Font: Ubuntu 11
CPU: Intel Core i7-8809G @ 8x 4.2GHz [27.8°C]
GPU: AMD VEGAM (DRM 3.33.0, 5.3.0-19-generic, LLVM 9.0.0)
RAM: 6278MiB / 32035MiB

Behavior before the change:

~/Projects/AI/gpt-2-output-dataset/detector$ python3 -m server detector-large.pt
Loading checkpoint from detector-large.pt
Starting HTTP server on port 8080
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ImportError: No module named torch
Traceback (most recent call last):
  File "/usr/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/drew/Projects/AI/gpt-2-output-dataset/detector/server.py", line 120, in <module>
    fire.Fire(main)
  File "/home/drew/.local/lib/python3.7/site-packages/fire/core.py", line 138, in Fire
    component_trace = _Fire(component, args, parsed_flag_args, context, name)
  File "/home/drew/.local/lib/python3.7/site-packages/fire/core.py", line 471, in _Fire
    target=component.__name__)
  File "/home/drew/.local/lib/python3.7/site-packages/fire/core.py", line 675, in _CallAndUpdateTrace
    component = fn(*varargs, **kwargs)
  File "/home/drew/Projects/AI/gpt-2-output-dataset/detector/server.py", line 96, in main
    num_workers = int(subprocess.check_output(['python', '-c', 'import torch; print(torch.cuda.device_count())']))
  File "/usr/lib/python3.7/subprocess.py", line 411, in check_output
    **kwargs).stdout
  File "/usr/lib/python3.7/subprocess.py", line 512, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['python', '-c', 'import torch; print(torch.cuda.device_count())']' returned non-zero exit status 1.

Behavior after the change (as expected):

~/Projects/AI/gpt-2-output-dataset/detector$ python3 -m server detector-large.pt
Loading checkpoint from detector-large.pt
Starting HTTP server on port 8080
[] Process has started; loading the model ...
[] Ready to serve
[] "GET / HTTP/1.1" 200 -
[] "GET /favicon.ico HTTP/1.1" 200 -
[] "GET /?This%20is%20an%20online%20demo%20of%20the%20GPT-2%20output%20detector%20model.%20Enter%20some%20text%20in%20the%20text%20box;%20the%20predicted%20probabilities%20will%20be%20displayed%20below.%20The%20results%20start%20to%20get%20reliable%20after%20around%2050%20tokens. HTTP/1.1" 200 -

jongwook commented 4 years ago

Good catch - thanks! I'll replace that with sys.executable so that it is not dependent on the executable name.
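
For reference, a minimal sketch of what that change might look like on line 96 (the subprocess probe as quoted in the traceback above, with sys.executable swapped in; everything outside that one line is an assumption):

import subprocess
import sys

# Use the interpreter that is running the server instead of a hard-coded
# "python", which may resolve to Python 2 (or nothing) on some systems.
num_workers = int(subprocess.check_output(
    [sys.executable, '-c', 'import torch; print(torch.cuda.device_count())']))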

loretoparisi commented 4 years ago

@jongwook even when using sys.executable, in the case of virtualenvs and aliases it will not work: on macOS you will get python, while I'm running with python3. Unfortunately we cannot use argv[0] in this case...

jongwook commented 4 years ago

Hmm... I hadn't thought about the virtualenv case; in the conda environments we're using, the executable has always been python.

I assume you're not doing multi-GPU training since you're on a Mac, so you may simply use:

if torch.cuda.is_available():
    num_workers = int(torch.cuda.device_count())

The whole subprocess workaround was there to avoid a CUDA error that can happen in multi-process multi-GPU training (see #13 for details).
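
For anyone hitting the same CUDA-and-fork problem, here is a minimal sketch of the more common alternative to the subprocess probe (not what server.py does): spawn the workers with torch.multiprocessing so CUDA is first initialized inside each child process. The worker body is a hypothetical placeholder and assumes at least one GPU is present.

import torch
import torch.multiprocessing as mp

def worker(rank):
    # Hypothetical worker: CUDA is first touched here, inside the spawned
    # child, so the parent never carries a CUDA context across a fork.
    torch.cuda.set_device(rank)
    print(f'worker {rank} ready on {torch.cuda.get_device_name(rank)}')

if __name__ == '__main__':
    num_gpus = torch.cuda.device_count()
    mp.spawn(worker, nprocs=max(num_gpus, 1))  # uses the 'spawn' start method by default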

loretoparisi commented 4 years ago

@jongwook ah yes, that is one way, thank you! A question: does num_workers refer to GPUs only? I mean, in TF I can do something like:

import multiprocessing
import os
import tensorflow as tf

N_CPU = multiprocessing.cpu_count()
# OMP_NUM_THREADS controls MKL's intra-op parallelization;
# default to the number of available cores.
os.environ['OMP_NUM_THREADS'] = str(max(1, N_CPU))
config = tf.ConfigProto(
    device_count={'GPU': 1, 'CPU': N_CPU},
    intra_op_parallelism_threads=0,
    inter_op_parallelism_threads=N_CPU,
    allow_soft_placement=True,
)
config.gpu_options.allow_growth = True
config.gpu_options.per_process_gpu_memory_fraction = 0.6

so that I can get at least 8-core parallelism on macOS, etc.

jongwook commented 4 years ago

Yeah, for single-node CPU training you shouldn't need multiprocessing, since the multithreading capability of the OMP/MKL backend should be sufficient.
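
For completeness, a rough PyTorch-side analogue of the TF config above (a sketch, assuming CPU-only use; torch.set_num_threads controls the intra-op thread pool backed by OMP/MKL):

import multiprocessing
import os

# OMP_NUM_THREADS must be set before torch is imported to take effect.
n_cpu = multiprocessing.cpu_count()
os.environ.setdefault('OMP_NUM_THREADS', str(n_cpu))

import torch

# Intra-op parallelism: how many threads a single op (e.g. a matmul) may use.
torch.set_num_threads(n_cpu)
print(f'PyTorch intra-op threads: {torch.get_num_threads()}')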