clj-python / libpython-clj

Python bindings for Clojure
Eclipse Public License 2.0

crash on take-gil #194

Closed: behrica closed this issue 1 year ago

behrica commented 2 years ago

see here:

https://clojurians.zulipchat.com/#narrow/stream/215609-libpython-clj-dev/topic/JVM.20crash.20on.20'take.20gil'

behrica commented 2 years ago

Repo with reproducible code will follow soon.

behrica commented 2 years ago

A repo with code and instructions on how to reproduce the issue is here: https://github.com/behrica/libpython-clj--194

behrica commented 2 years ago

I am still wondering if this is a bug in libpython-clj or if certain Python libraries are incompatible with libpython-clj and its use of embedded Python.

cnuernber commented 2 years ago

Taking a shot in the dark, I turned off automatic gil management in libpython-clj and ran your example w/o using require-python. This is also running with the Julia environment, which loads the signal-handling JVM stub that allows the JVM process to forward signals to Python when Python has the appropriate signal handler installed.

It worked - there is a lot to dig through here. Note that disabling the gil management means you have to call the java take-gil method exactly once. When doing the java-api I found out that we call check-gil literally many, many times per function call. I think there is a race condition in libpython there. With manual gil management we never check the GIL.

user> (System/getProperty "libpython_clj.manual_gil")
"true"
user> (def  train-data  [
                   ["Example sentence belonging to class 1" 1]
                   ["Example sentence belonging to class 0" 0]])

(def eval-data  [
                 ["Example eval sentence belonging to class 1" 1]
                 ["Example eval sentence belonging to class 0" 0]])

#'user/train-data
#'user/eval-data
user> (import '[libpython_clj2 python_api])
Execution error (ClassNotFoundException) at java.net.URLClassLoader/findClass (URLClassLoader.java:387).
libpython_clj2.python_api
user> (import '[libpython_clj2 java_api])
libpython_clj2.java_api
user> (java_api/initialize)
Syntax error (IllegalArgumentException) compiling . at (*cider-repl clj-python/libpython-clj-194:localhost:44857(clj)*:61:7).
No matching method initialize found taking 0 args for class libpython_clj2.java_api
user> (java_api/initialize nil)
Mar 21, 2022 7:26:16 AM clojure.tools.logging$eval8515$fn__8518 invoke
INFO: Detecting startup info
Mar 21, 2022 7:26:16 AM clojure.tools.logging$eval8515$fn__8518 invoke
INFO: Startup info {:lib-version "3.9", :java-library-path-addendum "/home/chrisn/miniconda3/lib", :exec-prefix "/home/chrisn/miniconda3", :executable "/home/chrisn/miniconda3/bin/python3", :libnames ("python3.9m" "python3.9"), :prefix "/home/chrisn/miniconda3", :base-prefix "/home/chrisn/miniconda3", :libname "python3.9m", :base-exec-prefix "/home/chrisn/miniconda3", :python-home "/home/chrisn/miniconda3", :version [3 9 1], :platform "linux"}
Mar 21, 2022 7:26:16 AM clojure.tools.logging$eval8515$fn__8518 invoke
INFO: Prefixing java library path: /home/chrisn/miniconda3/lib
Mar 21, 2022 7:26:17 AM clojure.tools.logging$eval8515$fn__8518 invoke
INFO: Loading python library: python3.9
Mar 21, 2022 7:26:17 AM clojure.tools.logging$eval8515$fn__8518 invoke
INFO: Reference thread starting
:ok
user> (java_api/lockGIL)
1
user> (require '[libpython-clj2.python :as py])
nil
user> (def clsmod (py/import-module "simpletransformers.classification"))
#'user/clsmod
user> (def pd (py/import-module "pandas"))
#'user/pd
user> (def train-df (py/call-attr pd "DataFrame" train-data))
#'user/train-df
user> (def eval-df (py/call-attr pd "DataFrame" eval-data))
#'user/eval-df
user> (def model (py/call-attr-kw clsmod "ClassificationModel" ["roberta" "roberta-base"]
                                  {:use_cude false
                                   :args {:use_multiprocessing false
                                          :overwrite_output_dir true
                                          :dataloader_num_workers 1}}))

Downloading:   0%|          | 0.00/481 [00:00<?, ?B/s]
Downloading: 100%|##########| 481/481 [00:00<00:00, 148kB/s]
Execution error at libpython-clj2.python.ffi/check-error-throw (ffi.clj:703).
Traceback (most recent call last):
  File "/home/chrisn/miniconda3/lib/python3.9/site-packages/simpletransformers/classification/classification_model.py", line 361, in __init__
    raise ValueError(
ValueError: 'use_cuda' set to True when cuda is unavailable. Make sure CUDA is available or set use_cuda=False.

user> (def model (py/call-attr-kw clsmod "ClassificationModel" ["roberta" "roberta-base"]
                                  {:use_cuda false
                                   :args {:use_multiprocessing false
                                          :overwrite_output_dir true
                                          :dataloader_num_workers 1}}))

Downloading:   0%|          | 0.00/478M [00:00<?, ?B/s]
Downloading: ... (progress output elided) ...
Downloading: 100%|##########| 478M/478M [00:49<00:00, 10.2MB/s]
Some weights of the model checkpoint at roberta-base were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'lm_head.decoder.weight', 'lm_head.layer_norm.bias', 'lm_head.dense.bias', 'roberta.pooler.dense.weight', 'lm_head.bias', 'lm_head.dense.weight', 'lm_head.layer_norm.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
Some weights of RobertaForSequenceClassification were not initialized from the model checkpoint at roberta-base and are newly initialized: ['classifier.dense.bias', 'classifier.out_proj.weight', 'classifier.dense.weight', 'classifier.out_proj.bias']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.

Downloading:   0%|          | 0.00/878k [00:00<?, ?B/s]
Downloading:   5%|5         | 44.0k/878k [00:00<00:01, 450kB/s]
Downloading:  21%|##        | 180k/878k [00:00<00:00, 988kB/s] 
Downloading:  90%|########9 | 788k/878k [00:00<00:00, 3.29MB/s]
Downloading: 100%|##########| 878k/878k [00:00<00:00, 2.84MB/s]

Downloading:   0%|          | 0.00/446k [00:00<?, ?B/s]
Downloading:   6%|6         | 28.0k/446k [00:00<00:01, 283kB/s]
Downloading:  40%|###9      | 178k/446k [00:00<00:00, 945kB/s] 
Downloading: 100%|##########| 446k/446k [00:00<00:00, 1.65MB/s]

Downloading:   0%|          | 0.00/1.29M [00:00<?, ?B/s]
Downloading:   3%|3         | 40.0k/1.29M [00:00<00:03, 408kB/s]
Downloading:  13%|#2        | 172k/1.29M [00:00<00:01, 945kB/s] 
Downloading:  56%|#####6    | 748k/1.29M [00:00<00:00, 3.10MB/s]
Downloading: 100%|##########| 1.29M/1.29M [00:00<00:00, 3.73MB/s]
#'user/model
user> (def x (py/call-attr model "train_model" train-df))
/home/chrisn/miniconda3/lib/python3.9/site-packages/simpletransformers/classification/classification_model.py:585: UserWarning: Dataframe headers not specified. Falling back to using column 0 as text and column 1 as labels.
  warnings.warn(

  0%|          | 0/2 [00:00<?, ?it/s]
 50%|#####     | 1/2 [00:00<00:00,  4.96it/s]
 50%|#####     | 1/2 [00:00<00:00,  4.94it/s]
/home/chrisn/miniconda3/lib/python3.9/site-packages/transformers/optimization.py:306: FutureWarning: This implementation of AdamW is deprecated and will be removed in a future version. Use the PyTorch implementation torch.optim.AdamW instead, or set `no_deprecation_warning=True` to disable this warning
  warnings.warn(
Epoch:   0%|          | 0/1 [00:00<?, ?it/s]
Epoch 1 of 1:   0%|          | 0/1 [00:00<?, ?it/s]
Running Epoch 0 of 1:   0%|          | 0/1 [00:00<?, ?it/s]
Epochs 0/1. Running Loss:    0.6930:   0%|          | 0/1 [00:00<?, ?it/s]
Epochs 0/1. Running Loss:    0.6930: 100%|##########| 1/1 [00:01<00:00,  1.68s/it]
Epochs 0/1. Running Loss:    0.6930: 100%|##########| 1/1 [00:01<00:00,  1.69s/it]

Epoch 1 of 1: 100%|##########| 1/1 [00:03<00:00,  3.96s/it]
Epoch 1 of 1: 100%|##########| 1/1 [00:03<00:00,  3.96s/it]
#'user/x
user> (println x)
(1, 0.6929999589920044)
nil
user> (println "finished train")
finished train
nil
user> 
cnuernber commented 2 years ago

Here is the deps.edn file with the jvm opts I used:

{:paths ["src" "resources"]

 :deps {org.clojure/clojure {:mvn/version "1.10.3"}
        clj-python/libpython-clj {:mvn/version "2.018"}}
 :aliases
 {:manual-gil {:jvm-opts ["-Dlibpython_clj.manual_gil=true"]}}}
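
Distilled from the transcript above, the manual-GIL workflow is roughly the following sketch (only initialize, lockGIL, and the py/* calls are taken verbatim from the session; run with the :manual-gil alias so libpython_clj.manual_gil is set):

(import '[libpython_clj2 java_api])
(require '[libpython-clj2.python :as py])

(java_api/initialize nil) ;; detect and load the python shared library
(java_api/lockGIL)        ;; with manual gil management, take the GIL exactly once

;; from here on, normal libpython-clj calls on this thread:
(def pd (py/import-module "pandas"))
(def df (py/call-attr pd "DataFrame" [["some text" 1]]))
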
behrica commented 2 years ago

Looks like good progress. I fear I do not understand enough about Python and the GIL to use this correctly. I do support your hypothesis of a race condition somewhere, as the problem disappears the moment I run the clj file via the clj CLI and not via a CIDER nREPL.

behrica commented 2 years ago

I can confirm that the above code fixes the issue, but I suppose it is a workaround, isn't it?

behrica commented 2 years ago

Just for information: playing around with lockGIL, I sometimes get a Python exception:

Traceback (most recent call last):
  File "/opt/conda/envs/st/lib/python3.9/threading.py", line 1486, in _after_fork
    thread._reset_internal_locks(True)
  File "/opt/conda/envs/st/lib/python3.9/threading.py", line 829, in _reset_internal_locks
    self._tstate_lock._at_fork_reinit()
AttributeError: 'NoneType' object has no attribute '_at_fork_reinit'
Exception ignored in: <function _after_fork at 0x7faf5aaf0790>
Traceback (most recent call last):
  File "/opt/conda/envs/st/lib/python3.9/threading.py", line 1486, in _after_fork
    thread._reset_internal_locks(True)
  File "/opt/conda/envs/st/lib/python3.9/threading.py", line 829, in _reset_internal_locks
    self._tstate_lock._at_fork_reinit()
AttributeError: 'NoneType' object has no attribute '_at_fork_reinit'
Exception ignored in: <function _after_fork at 0x7faf5aaf0790>
Traceback (most recent call last):
  File "/opt/conda/envs/st/lib/python3.9/threading.py", line 1486, in _after_fork
    thread._reset_internal_locks(True)
  File "/opt/conda/envs/st/lib/python3.9/threading.py", line 829, in _reset_internal_locks

but the code keeps working, so training finishes.

cnuernber commented 2 years ago

That is fascinating. I will follow up on this soon.

behrica commented 1 year ago

Anything new on this? I have a reproducible crash (using automatic gil management) and a working version with manual GIL management using the Java API.

behrica commented 1 year ago

I still cannot use the simpletransformers library with libpython-clj.

jjtolton commented 1 year ago

I'll try to take a look Monday, out of town today. I've used simpletransformers a few times so I'll see if I can get a working setup.

behrica commented 1 year ago

Just to add: it happens in the REPL only. It works in a CLJ file.

jjtolton commented 1 year ago

I wasn't able to make much progress with this, unfortunately. Very strange. Glad it is working in a clj file.

behrica commented 1 year ago

I think @cnuernber was investigating this to a certain level, see comments before.

He suspected "I think there is a race condition in libpython there."

I was hoping that the fact of having it working with the Java API helps to solve it, as apparently the Clojure API of libpython-clj does something "wrong" / different.

Not being able to use it from the REPL is, of course, a bit of a blocker.

behrica commented 1 year ago

This branch contains the working code using the Java API and disabling automatic GIL handling via properties: https://github.com/behrica/libpython-clj--194/tree/withManualGil

behrica commented 1 year ago

By random permutation of the code I have now found a (nearly) working clj file: https://github.com/behrica/libpython-clj--194/commit/2308c8477b1bdb177f68e788b28584f56c083c5c

This file can be loaded successfully one time in a connected REPL via the instructions here: https://github.com/behrica/libpython-clj--194/blob/main/README.md

But repeating step 3 in the same REPL makes it crash again.

behrica commented 1 year ago

The code is very "sensitive" about where the "require" statements are. Replacing, for example, the ':require' in the ns declaration with the "require" macro makes it crash again.
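
For illustration, the two variants look roughly like this (the namespace name is hypothetical):

;; loads fine (one time): :require inside the ns declaration
(ns repro.core
  (:require [libpython-clj2.python :as py]))

;; crashes again: the require macro instead
(require '[libpython-clj2.python :as py])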

behrica commented 1 year ago

Executing it line by line in a REPL from Emacs works as well, but only one time. Re-executing the "(py/with-gil-stack-rc-context ..." form makes it crash again.

behrica commented 1 year ago

I think the crash is related to "where" I load the python modules. If I load them outside the with-gil-stack-rc-context it crashes:

(py/import-as "pandas" pd)
(py/import-as "simpletransformers.classification" classification)
(py/with-gil-stack-rc-context .....
...
do something with teh module

If I load them inside, it works, but only once:


(py/with-gil-stack-rc-context ;; without this it crashes
  (py/import-as "pandas" pd)
  (py/import-as "simpletransformers.classification" classification)
  ;; ... do something with the module
  )

behrica commented 1 year ago

Just a detail:

it is not the training of the model which crashes, but the constructor of the model (which loads the model files from the internet). So this line is not even needed: x (py. model train_model train-df)]

jjtolton commented 1 year ago

That’s what I found, too. Actually, what I found is that training the model does not cause the crash, but rather that training the model is what makes the system unstable. For instance, when you train the model and then run (require-python 'os), the REPL will crash.

I believe what is happening is that the GIL is not being released by the python code in the expected way. When I check the crash reports, the core dump always happens on trying to acquire the GIL.
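
A minimal sketch of that destabilization, reusing names from the transcript earlier in this thread:

;; training itself succeeds...
(def x (py/call-attr model "train_model" train-df))
;; ...but the next GIL acquisition crashes the JVM, e.g.:
(require-python 'os) ;; SIGSEGV in take_gil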


behrica commented 1 year ago

I finally have it fully working.

behrica commented 1 year ago

Just pushed it to https://github.com/behrica/libpython-clj--194/commit/870124c261902f89d75133e8e82a311ab3db7c8f

behrica commented 1 year ago

The usage of "require-python" instead of the other function from the ns libpython-clj2.python to load a module, such as_

made it finally work.

behrica commented 1 year ago

So it looks to me that only "require-python" behaves well, especially on repeated executions in the same REPL instance, while the other three ways to load a module "leak" something or similar.

jjtolton commented 1 year ago

That's really interesting and surprising. But now that I think about it, this makes sense -- one of the differences is that require-python uses the Python runtime to load the modules via Python's importlib, whereas the others use lower-level (I believe) ABIs to load the code. I am guessing that the Python ML code is also using low-level ABIs and C code.

jjtolton commented 1 year ago

We should probably add documentation advising new users to steer away from the lower-level functions and to use the methods in the require namespace unless they have a specific reason to do otherwise.

behrica commented 1 year ago

I verified again, and indeed the whole issue starts the moment I switch from this syntax:

working

(py-require/require-python '[simpletransformers.classification
                              :as classification])
(classification/ClassificationModel

             :use_cuda false
             :model_type "bert"
             :model_name "prajjwal1/bert-tiny"
             :args
             (py/->py-dict
              {:num_train_epochs 1
               :use_multiprocessing false
               :overwrite_output_dir true}))

to

crashing

(py/import-as "simpletransformers.classification" classification)
  (py/py. classification ClassificationModel

             :use_cuda false
             :model_type "bert"
             :model_name "prajjwal1/bert-tiny"
             :args
             (py/->py-dict
              {:num_train_epochs 1
               :use_multiprocessing false
               :overwrite_output_dir true}))
jjtolton commented 1 year ago

Great work. Small API note,

(classification/ClassificationModel

             :use_cuda false
             :model_type "bert"
             :model_name "prajjwal1/bert-tiny"
             :args
              {:num_train_epochs 1
               :use_multiprocessing false
               :overwrite_output_dir true})

should work fine; you only need to cast it with py/->py-dict when using the py., py.., and py.- macros.

jjtolton commented 1 year ago

I'm curious now if multiprocessing works!

behrica commented 1 year ago

It's back...

I had it reliably working for a while, but now the crash is back, with the same code and the same instructions, using Docker.

@jjtolton Could you maybe try yourself, with this commit: https://github.com/behrica/libpython-clj--194/commit/870124c261902f89d75133e8e82a311ab3db7c8f

and the instructions here: https://github.com/behrica/libpython-clj--194/blob/main/how_to_reproduce.txt

jjtolton commented 1 year ago

Darn! Yes I will try in a few hours

behrica commented 1 year ago

At least I understand now why it worked once.

By "playing" with the code and disable things temporary, for example the training run, I can bring th REPL in a certain state, where then the training works.

jjtolton commented 1 year ago

I had it working once too but I was unable to replicate after I restarted the Docker to confirm. Keep getting this:

#                                                   
# A fatal error has been detected by the Java Runtime Environment:                                       
#                                                   
#  SIGSEGV (0xb) at pc=0x0000560ad3ec0a49, pid=1284, tid=1328                                            
#                                                   
# JRE version: OpenJDK Runtime Environment (11.0.16+8) (build 11.0.16+8-post-Debian-1deb11u1)            
# Java VM: OpenJDK 64-Bit Server VM (11.0.16+8-post-Debian-1deb11u1, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:                                
# C  [python+0x143a49]  take_gil+0x49

or something similar

jjtolton commented 1 year ago

I tried a slightly different experiment.

#!./venv/bin/python
from clojurebridge import cljbridge
import sys
import time
import threading
import multiprocessing

def start_repl():
    # repl = multiprocessing.Process(target=cljbridge.init_jvm,
    #                         kwargs=dict(start_repl=True,port=12345,bind='0.0.0.0',cider_nrepl_version='0.27.2'))
    repl = threading.Thread(target=cljbridge.init_jvm,
                            kwargs=dict(start_repl=True,port=12345,bind='0.0.0.0',cider_nrepl_version='0.27.2'))
    repl.start()
    return repl

def main():
    print("Starting Clojure REPL process...")
    repl = start_repl()
    return repl

if __name__ == '__main__':
    repl = main()

run with

#!/usr/bin/env bash
function main() {
  conda run --no-capture-output -n st ipython -i run.py
}

main

This way, I can use the Python interpreter and the Clojure REPL at the same time and share data between them. I also wrote some helper Python:

# model_data.py

import os
import simpletransformers.classification
import simpletransformers.classification as classification
os.environ["TOKENIZERS_PARALLELISM"] = "false"
def get_model():
    model = classification.ClassificationModel(
    "roberta", "roberta-base", 
        use_cuda=False, 
        args=dict(
            use_multiprocessing=False,
            overwrite_output_dir=True,
            dataloader_num_workers=1, #
            # n_gpu=0,
            # process_count=1,
            use_multiprocessing_for_evaluation=False

        ))
    return model

def train_model(model, data):
    model.train_model(data)
    return model

then I proceeded to use it like this:

(require-python '[__main__ :as  main :bind-ns true])

  (require-python '[simpletransformers.classification
                    :as classification])
  (require-python '[pandas :as pd])
  (require-python '[simpletransformers.classification.ClassificationModel :as ClassificationModel])

  (def  train-data  [["Example sentence belonging to class 1" 1]
                     ["Example sentence belonging to class 0" 0]])

  (def eval-data  [["Example eval sentence belonging to class 1" 1]
                   ["Example eval sentence belonging to class 0" 0]])
  (require-python '[model_data :as model-data])
  (def train-df (pd/DataFrame train-data))

  ;; in Python-interpreter:
  ;; >>> model = model_data.get_model()

  (py/set-attr! main "train_df" train-df)

  ;; In [3]: train_df
  ;; Out[3]: 
  ;; 0  Example sentence belonging to class 1  1
  ;; 1  Example sentence belonging to class 0  0
  ;; In [4]: model_data.train_model(model, train_df)
  ;; Out[4]: <simpletransformers.classification.classification_model.ClassificationModel at 0x7faec15f6d90>

  (def model (py/get-attr main "model"))

  (require-python 'sys)

I'm not really sure what, if anything, this proves. But it might be a useful technique if you're trying to push ahead with dabbling in the REPL.

I don't know enough of the API code to test the model in Clojure to see if you can at least use it for labeling inputs, but it might be an alternative that allows you to enjoy developing in libpython-clj until we can track down the issue.

cnuernber commented 1 year ago

A few things. First - require-python uses import-module unless :reload is specified and it does so during the metadata scan.

So if the issue is somehow calling import-module multiple times on the same module, the require-python pathway can't work as currently written.

Another issue also went way down this pathway - https://github.com/clj-python/libpython-clj/pull/64 - again working with this pathway didn't fix the issue.

One assumption of the clojure API is automatic gil management. The java API does not have this assumption and this matches more closely how most of the other systems written against the C api work.

The progress I see personally in this issue is that potentially we can narrow this down to multiple import-module calls. One thing I would try next would be something like calling System/gc manually, as this will place objects into the dereference queue - my thought is that potentially the torch system doesn't handle module dereference very well, or something along those lines.

My suggestion is to focus purely on combinations of import-module, System/gc, and clear-reference-queue.
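
A minimal sketch of that experiment (the module name is taken from this thread; where clear-reference-queue lives is not shown here, so treat that part as an assumption to adapt to your version):

(require '[libpython-clj2.python :as py])

;; repeatedly re-import the module, forcing JVM gc runs in between
;; and watching for the crash:
(dotimes [_ 5]
  (py/import-module "simpletransformers.classification")
  (System/gc)
  ;; also interleave clear-reference-queue here, from wherever the
  ;; installed libpython-clj version exposes it
  (Thread/sleep 100))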

What if you wrap what you want to do in a separate local python file and load it specifically? My thought is that whatever instability exists with the torch system exists because they have some extremely custom module implementation, and the default python interpreter handles their modules the way they expect - such as never dereferencing them - and if we put a vanilla python file between us and torch that may in fact be enough.

cnuernber commented 1 year ago

Another possibility is that they have some process state in their C bindings that is confused when loaded from the java process. This would be the least ideal; we would have to find it and prove it beyond a reasonable doubt to go further, but it may be the case.

jjtolton commented 1 year ago

I tried this approach in https://github.com/clj-python/libpython-clj/issues/194#issuecomment-1238437004. This is the first time I've ever seen an instability survive. I even tried this:

#model_data.py
import os
import simpletransformers.classification
import simpletransformers.classification as classification
os.environ["TOKENIZERS_PARALLELISM"] = "false"
def get_model():
    model = classification.ClassificationModel(
    "roberta", "roberta-base", 
        use_cuda=False, 
        args=dict(
            use_multiprocessing=False,
            overwrite_output_dir=True,
            dataloader_num_workers=1, #
            # n_gpu=0,
            # process_count=1,
            use_multiprocessing_for_evaluation=False

        ))
    return model

def train_model(model, data):
    model.train_model(data)
    return model

def whole_thing(data):
    model = get_model()
    train_model(model, data)
    return model

and then in Clojure

    (require-python '[pandas :as pd]
                  '[model_data :as model-data])
  (def  train-data  [["Example sentence belonging to class 1" 1]
                     ["Example sentence belonging to class 0" 0]])
  (def train-df (pd/DataFrame train-data))
  (def model (model-data/whole_thing train-df)) ;; crash

and this really surprised me.

I have no doubt they are doing something very funky with low-level C but as you said, very difficult to prove.

cnuernber commented 1 year ago

Wow - yep - even when loading a separate python file we still see the crash. So IMO it is unlikely to be the import-module system, but more related to some process state the JVM sets up that torch is conflicting with. I wonder if torch installs handlers for some of the memory signals such as SIGSEGV. The JVM intentionally causes SIGSEGVs that it is supposed to catch, but if torch is doing something similar - which I can't imagine - then that could result in a crash. We ran into this when working with Julia - specifically, Julia is only stable when using signal chaining - but if I remember correctly I did try this with torch and still had the same issues.

behrica commented 1 year ago

This approach (the model_data.py whole_thing example above) is now working for me... No crash, and I get a trained model...

jjtolton commented 1 year ago

From the REPL or clj cli?

behrica commented 1 year ago

Indeed, you are right. Via NREPL it does not work.

(which is how I tried it before)

Via normal REPL or clj files it always works.

behrica commented 1 year ago

The "embedded libpython" and the connected NREPL make it fail.

jjtolton commented 1 year ago

Is it reliable now via normal REPL and clj files? Really odd then that this is an issue between simpletransformers, libpython-clj, and nREPL. :thinking:

behrica commented 1 year ago

Out of the 4 ways of calling the above same code:

  1. clj file
  2. REPL
  3. embedded libpython-clj with cljbridge.load_clojure_file(clj_file="simpletransformers.clj")
  4. embedded libpython-clj via connected nREPL

only 4 is failing.

4) is for me the most convenient, as it works well with Docker (and for me Docker is the best way to use libpython-clj), but others might disagree.

cnuernber commented 1 year ago

Any repl activity significantly increases the chances of a gc run. As I stated before, when I was working on Julia everything would work fine until a gc call, and I saw nearly the same symptoms - a simple program would work from the command line but fail during nrepl activity.

jjtolton commented 1 year ago

That's interesting. Any insight into what is particular about a repl that causes gc issues?

cnuernber commented 1 year ago

It simply causes the GC to run in general. Any GC run will cause issues - that was my point about trying to call (System/gc) to force the issue. The GC causes the SIGSEGV events to fire in some cases, but it isn't a GC issue; it is an interaction between how the JVM works and the Python or Julia interpreters.
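
Following that reasoning, a sketch of a direct probe - reusing names from earlier in the thread - is to force a collection right after any torch-touching work instead of waiting for nREPL activity:

(def x (py/call-attr model "train_model" train-df))
(System/gc) ;; if the theory holds, the SIGSEGV in take_gil fires around here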