get hash endpoint error

Hi,

Hope you are all well!

I tried to get the hash of an abstract and it triggers the following error:

semantic-sh_1                 |  * Tip: There are .env or .flaskenv files present. Do "pip install python-dotenv" to use them.
semantic-sh_1                 |  * Serving Flask app "server" (lazy loading)
semantic-sh_1                 |  * Environment: production
semantic-sh_1                 |    WARNING: This is a development server. Do not use it in a production deployment.
semantic-sh_1                 |    Use a production WSGI server instead.
semantic-sh_1                 |  * Debug mode: off
semantic-sh_1                 | /opt/service/semantic_sh/semantic_sh.py:51: FutureWarning: arrays to stack must be passed as a "sequence" type such as list or tuple. Support for non-sequence iterables such as generators is deprecated as of NumPy 1.16 and will raise an error in the future.
semantic-sh_1                 |   return np.vstack((np.random.normal(0, 1, dim) for i in range(0, key_size)))
semantic-sh_1                 |  * Running on http://0.0.0.0:5001/ (Press CTRL+C to quit)
semantic-sh_1                 | Truncation was not explicitely activated but `max_length` is provided a specific value, please use `truncation=True` to explicitely truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
semantic-sh_1                 | [2020-08-09 06:48:59,687] ERROR in app: Exception on /api/hash [GET]
semantic-sh_1                 | Traceback (most recent call last):
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2447, in wsgi_app
semantic-sh_1                 |     response = self.full_dispatch_request()
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1952, in full_dispatch_request
semantic-sh_1                 |     rv = self.handle_user_exception(e)
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1821, in handle_user_exception
semantic-sh_1                 |     reraise(exc_type, exc_value, tb)
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 39, in reraise
semantic-sh_1                 |     raise value
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1950, in full_dispatch_request
semantic-sh_1                 |     rv = self.dispatch_request()
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1936, in dispatch_request
semantic-sh_1                 |     return self.view_functions[rule.endpoint](**req.view_args)
semantic-sh_1                 |   File "./server.py", line 20, in generate_hash
semantic-sh_1                 |     return hex(sh.get_hash(txt))
semantic-sh_1                 |   File "/opt/service/semantic_sh/semantic_sh.py", line 88, in get_hash
semantic-sh_1                 |     y = np.matmul(self._proj, enc)
semantic-sh_1                 | ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 768 is different from 300)
semantic-sh_1                 | 51.210.37.251 - - [09/Aug/2020 06:48:59] "GET /api/hash?text=Recent+work+has+demonstrated+substantial+gains+on+many+NLP+tasks+and+benchmarks+by+pre-training+on+a+large+corpus+of+text+followed+by+fine-tuning+on+a+specific+task.+While+typically+task-agnostic+in+architecture%2C+this+method+still+requires+task-specific+fine-tuning+datasets+of+thousands+or+tens+of+thousands+of+examples.+By+contrast%2C+humans+can+generally+perform+a+new+language+task+from+only+a+few+examples+or+from+simple+instructions+-+something+which+current+NLP+systems+still+largely+struggle+to+do.+Here+we+show+that+scaling+up+language+models+greatly+improves+task-agnostic%2C+few-shot+performance%2C+sometimes+even+reaching+competitiveness+with+prior+state-of-the-art+fine-tuning+approaches.+Specifically%2C+we+train+GPT-3%2C+an+autoregressive+language+model+with+175+billion+parameters%2C+10x+more+than+any+previous+non-sparse+language+model%2C+and+test+its+performance+in+the+few-shot+setting.+For+all+tasks%2C+GPT-3+is+applied+without+any+gradient+updates+or+fine-tuning%2C+with+tasks+and+few-shot+demonstrations+specified+purely+via+text+interaction+with+the+model.+GPT-3+achieves+strong+performance+on+many+NLP+datasets%2C+including+translation%2C+question-answering%2C+and+cloze+tasks%2C+as+well+as+several+tasks+that+require+on-the-fly+reasoning+or+domain+adaptation%2C+such+as+unscrambling+words%2C+using+a+novel+word+in+a+sentence%2C+or+performing+3-digit+arithmetic.+At+the+same+time%2C+we+also+identify+some+datasets+where+GPT-3%27s+few-shot+learning+still+struggles%2C+as+well+as+some+datasets+where+GPT-3+faces+methodological+issues+related+to+training+on+large+web+corpora.+Finally%2C+we+find+that+GPT-3+can+generate+samples+of+news+articles+which+human+evaluators+have+difficulty+distinguishing+from+articles+written+by+humans.+We+discuss+broader+societal+impacts+of+this+finding+and+of+GPT-3+in+general. HTTP/1.1" 500 -

Any idea how to sort this out? Is it related to the server configuration?
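For what it's worth, the `matmul` failure can be reproduced outside the server. The sketch below is only my reading of the traceback, not the actual semantic-sh code: it assumes the projection matrix was built for 300-dimensional vectors while the encoder returns 768-dimensional ones (768 is the hidden size of BERT-base style models; 300 is typical of word-vector models, but that part is a guess). The `make_projection` helper and `key_size=128` are hypothetical. The helper also passes a list to `np.vstack`, which avoids the `FutureWarning` shown in the log.

```python
import numpy as np


def make_projection(key_size, dim):
    """Hypothetical stand-in for the random projection built in semantic_sh.py.

    A list (not a generator) is passed to np.vstack, which avoids the
    FutureWarning from the log.
    """
    return np.vstack([np.random.normal(0, 1, dim) for _ in range(key_size)])


# Assumption: the projection was created for 300-dim vectors ...
proj = make_projection(key_size=128, dim=300)
# ... while the encoder produces a 768-dim vector (placeholder values here).
enc = np.zeros(768)

mismatch_error = None
try:
    np.matmul(proj, enc)  # (128, 300) x (768,) cannot be multiplied
except ValueError as exc:
    mismatch_error = str(exc)

# Building the projection from the encoder's actual output size removes the mismatch.
proj_ok = make_projection(key_size=128, dim=enc.shape[0])
y = np.matmul(proj_ok, enc)  # shape (128,)
```

If this reading is right, the fix would be to make the `dim` used for the projection match the output size of whichever model the server loads.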

Cheers, X

The same error occurs when trying to add a text.

semantic-sh_1                 | Truncation was not explicitely activated but `max_length` is provided a specific value, please use `truncation=True` to explicitely truncate examples to max length. Defaulting to 'longest_first' truncation strategy. If you encode pairs of sequences (GLUE-style) with the tokenizer you can select this strategy more precisely by providing a specific strategy to `truncation`.
semantic-sh_1                 | [2020-08-09 06:53:19,438] ERROR in app: Exception on /api/add [GET]
semantic-sh_1                 | Traceback (most recent call last):
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 2447, in wsgi_app
semantic-sh_1                 |     response = self.full_dispatch_request()
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1952, in full_dispatch_request
semantic-sh_1                 |     rv = self.handle_user_exception(e)
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1821, in handle_user_exception
semantic-sh_1                 |     reraise(exc_type, exc_value, tb)
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/_compat.py", line 39, in reraise
semantic-sh_1                 |     raise value
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1950, in full_dispatch_request
semantic-sh_1                 |     rv = self.dispatch_request()
semantic-sh_1                 |   File "/usr/local/lib/python3.6/dist-packages/flask/app.py", line 1936, in dispatch_request
semantic-sh_1                 |     return self.view_functions[rule.endpoint](**req.view_args)
semantic-sh_1                 |   File "./server.py", line 26, in add
semantic-sh_1                 |     sh.add_document(txt)
semantic-sh_1                 |   File "/opt/service/semantic_sh/semantic_sh.py", line 96, in add_document
semantic-sh_1                 |     h = self.get_hash(txt)
semantic-sh_1                 |   File "/opt/service/semantic_sh/semantic_sh.py", line 88, in get_hash
semantic-sh_1                 |     y = np.matmul(self._proj, enc)
semantic-sh_1                 | ValueError: matmul: Input operand 1 has a mismatch in its core dimension 0, with gufunc signature (n?,k),(k,m?)->(n?,m?) (size 768 is different from 300)
semantic-sh_1                 | 51.210.37.251 - - [09/Aug/2020 06:53:19] "GET /api/add?text=Recent+work+has+demonstrated+substantial+gains+on+many+NLP+tasks+and+benchmarks+by+pre-training+on+a+large+corpus+of+text+followed+by+fine-tuning+on+a+specific+task.+While+typically+task-agnostic+in+architecture%2C+this+method+still+requires+task-specific+fine-tuning+datasets+of+thousands+or+tens+of+thousands+of+examples.+By+contrast%2C+humans+can+generally+perform+a+new+language+task+from+only+a+few+examples+or+from+simple+instructions+-+something+which+current+NLP+systems+still+largely+struggle+to+do.+Here+we+show+that+scaling+up+language+models+greatly+improves+task-agnostic%2C+few-shot+performance%2C+sometimes+even+reaching+competitiveness+with+prior+state-of-the-art+fine-tuning+approaches.+Specifically%2C+we+train+GPT-3%2C+an+autoregressive+language+model+with+175+billion+parameters%2C+10x+more+than+any+previous+non-sparse+language+model%2C+and+test+its+performance+in+the+few-shot+setting.+For+all+tasks%2C+GPT-3+is+applied+without+any+gradient+updates+or+fine-tuning%2C+with+tasks+and+few-shot+demonstrations+specified+purely+via+text+interaction+with+the+model.+GPT-3+achieves+strong+performance+on+many+NLP+datasets%2C+including+translation%2C+question-answering%2C+and+cloze+tasks%2C+as+well+as+several+tasks+that+require+on-the-fly+reasoning+or+domain+adaptation%2C+such+as+unscrambling+words%2C+using+a+novel+word+in+a+sentence%2C+or+performing+3-digit+arithmetic.+At+the+same+time%2C+we+also+identify+some+datasets+where+GPT-3%27s+few-shot+learning+still+struggles%2C+as+well+as+some+datasets+where+GPT-3+faces+methodological+issues+related+to+training+on+large+web+corpora.+Finally%2C+we+find+that+GPT-3+can+generate+samples+of+news+articles+which+human+evaluators+have+difficulty+distinguishing+from+articles+written+by+humans.+We+discuss+broader+societal+impacts+of+this+finding+and+of+GPT-3+in+general. HTTP/1.1" 500 -

Also, wouldn't it be better to use the POST method for these two endpoints?
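To illustrate what I mean, here is a rough client-side sketch that sends the text in the request body instead of the query string. It assumes the server would accept a form-encoded POST on `/api/hash`, which it does not do today according to the logs (both calls are GET). The `build_hash_request` helper and the `localhost` base URL are hypothetical; the port is taken from the log above.

```python
from urllib.parse import parse_qs, urlencode
from urllib.request import Request

# Assumption: server reachable locally on the port shown in the log.
BASE_URL = "http://localhost:5001"


def build_hash_request(text, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying the text in the body."""
    body = urlencode({"text": text}).encode("utf-8")
    return Request(
        f"{base_url}/api/hash",
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


req = build_hash_request("Recent work has demonstrated substantial gains on many NLP tasks.")
# The text travels in req.data rather than in the URL.
sent_text = parse_qs(req.data.decode("utf-8"))["text"][0]
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen`. With POST, a long abstract would no longer end up in the URL and in the access log line.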

KeremZaman / semantic-sh

get hash endpoint error #5