HigashiKed / patent_prior-art_search

Prior-art patent search

Files #13

Closed HigashiKed closed 3 years ago

HigashiKed commented 3 years ago

Number of files: 2,680,604

20:00 "count":117718 20:10 "count":126089 20:20 "count":134197

20:40 150272 21:20 164551 21:30 172115 21:40 176014 21:50 183951

22:00 Programs to run: test.py test_reversed.py test_3.py

00:10 5385 00:20 16655 00:30 28235 00:40 39703 00:50 50960

5:15 154617 5:20 162885 5:30 181422

8:10 356410 8:20 370508 8:30 387213 8:40 399981 8:45 409267 8:50 417753 9:30 475244 9:40 484088 9:45 500982 10:00 528533 10:10 542329

13:50 771163 15:15 831213

22:45 1103339 23:15 1136184 23:30 1154471

5:00 1471580 8:00 1619474 8:45 1654101 9:00 1667915 9:10 1674312 9:20 1683830 9:30 1693921 9:40 1702114 9:50 1709971

10:50 1757442 11:00 1766843

Processing changed.

11:30 1805809 11:40 1822713 12:30 1904140 12:40 1923110 12:50 1944185 13:00 1957069 13:05 1964814 14:40 2076131 14:50 2087083 15:00 2098905 15:10 2110241 15:30 2134955 15:50 2162440 16:00 2180081 16:10 2200016 16:20 2220933 16:40 2252090 17:40 2351318 18:20 2400716 19:00 2485513 19:20 2517179
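The time/count pairs above can be turned into an average insert rate and a rough time-to-finish. This is only a sketch: the helper names are hypothetical, and it assumes the two samples being compared are less than 24 hours apart, which the log does not state because it carries no dates.

```python
from datetime import datetime, timedelta

TOTAL_FILES = 2_680_604  # file count stated at the top of this thread


def docs_per_minute(t0, c0, t1, c1):
    """Average documents per minute between two (HH:MM, count) samples.

    Assumes the samples are less than 24 hours apart; a non-positive
    difference is treated as a wrap past midnight.
    """
    fmt = "%H:%M"
    elapsed = datetime.strptime(t1, fmt) - datetime.strptime(t0, fmt)
    if elapsed <= timedelta(0):
        elapsed += timedelta(days=1)
    return (c1 - c0) / (elapsed.total_seconds() / 60)


def minutes_remaining(count, rate, total=TOTAL_FILES):
    """Minutes left at a constant rate (documents per minute)."""
    return (total - count) / rate


# Last stretch of the log, from 11:30 to 19:20.
rate = docs_per_minute("11:30", 1805809, "19:20", 2517179)
eta = minutes_remaining(2517179, rate)
print(f"{rate:.0f} docs/min, about {eta:.0f} min remaining")
```

The estimate treats the rate as constant, so it is only as good as that assumption; the log shows the rate varying between sessions.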

HigashiKed commented 3 years ago

test.py: stopped with an error after 39492 iterations:
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))

test_reverse: stopped with an error after 53353 iterations:
socket.timeout: timed out
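Both failures above are client-side timeouts, so one possible mitigation is to retry the failed call with a growing delay instead of letting the script die. The helper below is a hypothetical sketch, not what the scripts in this repository do. It uses only the standard library and retries on `socket.timeout` by default; to cover the first error, `elasticsearch.exceptions.ConnectionTimeout` would have to be passed in `retry_on`.

```python
import socket
import time


def call_with_retry(func, *args, retries=5, base_delay=1.0,
                    retry_on=(socket.timeout,), sleep=time.sleep, **kwargs):
    """Call func(*args, **kwargs), retrying on the given exception types.

    Waits base_delay * 2**attempt seconds between attempts and re-raises
    the last exception once `retries` retries have been used up.
    """
    for attempt in range(retries + 1):
        try:
            return func(*args, **kwargs)
        except retry_on:
            if attempt == retries:
                raise
            sleep(base_delay * 2 ** attempt)


# Stand-in for an indexing call that times out twice, then succeeds.
attempts = {"n": 0}


def flaky_insert():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise socket.timeout("timed out")
    return "ok"


delays = []  # record the waits instead of actually sleeping
result = call_with_retry(flaky_insert, sleep=delays.append)
```

As far as I know, the 7.x Python client also accepts `timeout`, `max_retries` and `retry_on_timeout` when constructing `Elasticsearch(...)`, which may be simpler than wrapping each call; check this against the installed client version.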

HigashiKed commented 3 years ago

https://discuss.elastic.co/t/python-script-update-by-query-elasticsearch-doesnt-work/199334/3

HigashiKed commented 3 years ago

test_5 52204it [3:06:08, 6.03it/s]/Users/higashi/Downloads/EP/000000/09/57/43/EP-0095743-B2.xml
Traceback (most recent call last):
  File "test_5.py", line 353, in <module>
    es.create(index=index_name, id=tmp_body["ucid"],body=json.dumps(tmp_body,indent=2) )
  File "/Users/higashi/anaconda3/lib/python3.7/site-packages/elasticsearch/client/utils.py", line 152, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/Users/higashi/anaconda3/lib/python3.7/site-packages/elasticsearch/client/__init__.py", line 334, in create
    "PUT", path, params=params, headers=headers, body=body
  File "/Users/higashi/anaconda3/lib/python3.7/site-packages/elasticsearch/transport.py", line 392, in perform_request
    raise e
  File "/Users/higashi/anaconda3/lib/python3.7/site-packages/elasticsearch/transport.py", line
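The traceback shows one `es.create` request per XML file, and the progress bars in this thread report roughly 4 to 6 iterations per second. A possible alternative is to batch documents through the client's bulk helper. The generator below is a sketch with a hypothetical name (`bulk_actions`); it only builds the action dictionaries, using the `ucid` field as the document id as in the call above.

```python
def bulk_actions(docs, index_name):
    """Yield one bulk 'create' action per parsed document.

    Each document is expected to be a dict with a "ucid" key, as in the
    es.create call in the traceback above.
    """
    for doc in docs:
        yield {
            "_op_type": "create",  # fail on an existing id, like es.create
            "_index": index_name,
            "_id": doc["ucid"],
            "_source": doc,
        }


# Made-up one-document example; only the id comes from this thread.
sample = [{"ucid": "EP-0095743-B2"}]
actions = list(bulk_actions(sample, "clef_patent"))

# Intended use (requires the elasticsearch package and a running node):
#   from elasticsearch import helpers
#   helpers.bulk(es, bulk_actions(docs, "clef_patent"))
```

Whether this would have avoided the timeouts here is untested; it only reduces the number of HTTP round trips.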

HigashiKed commented 3 years ago

test_5 10544it [37:47, 3.90it/s]/Users/higashi/Downloads/EP/000000/09/31/54/EP-0093154-A1.xml
Traceback (most recent call last):
  File "/Users/higashi/anaconda3/lib
elasticsearch.exceptions.ConnectionTimeout: ConnectionTimeout caused by - ReadTimeoutError(HTTPConnectionPool(host='localhost', port=9200): Read timed out. (read timeout=10))

HigashiKed commented 3 years ago

test_reverse 109657it [3:56:06, 3.89it/s]/Users/higashi/Downloads/EP/000000/47/25/84/EP-0472584-B1.xml
Traceback (most recent call last):
elasticsearch.exceptions.RequestError: RequestError(400, 'mapper_parsing_exception', "failed to parse field [claims.lang] of type [text] in document with id 'EP-0472584-B1'. Preview of field's value: '{}'")
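The error above says `claims.lang` is mapped as `text` but the document carried an empty object (`{}`) there. One way to avoid this, assuming the XML-to-dict step emits `{}` for a missing value (a guess, not confirmed in this thread), is to strip empty objects before indexing. `drop_empty_objects` is a hypothetical helper name.

```python
def drop_empty_objects(value):
    """Recursively remove empty dicts from dict values and from lists.

    A dict that becomes empty after cleaning is removed from its parent
    as well, so the removal cascades upward.
    """
    if isinstance(value, dict):
        cleaned = {k: drop_empty_objects(v) for k, v in value.items()}
        return {k: v for k, v in cleaned.items() if v != {}}
    if isinstance(value, list):
        cleaned = [drop_empty_objects(v) for v in value]
        return [v for v in cleaned if v != {}]
    return value


# Made-up document shape; only the id and the field name come from the error.
doc = {"ucid": "EP-0472584-B1",
       "claims": {"lang": {}, "text": "example claim"}}
clean = drop_empty_objects(doc)
```

With the sample above, `claims.lang` is dropped and the rest of the document is left as it was.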

HigashiKed commented 3 years ago

All documents inserted. index = clef_patent