ServiceNow / picard

PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models. PICARD is a ServiceNow Research project that was started at Element AI.
https://arxiv.org/abs/2109.05093
Apache License 2.0

thrift.py3.exceptions.TransportError #92

Open eyuansu62 opened 2 years ago

eyuansu62 commented 2 years ago

Hi, when I run the 'make eval' command, I get the following error. Do you have any idea how to fix it?

  File "/app/t5-unc/utils/picard_model_wrapper.py", line 200, in with_picard
    asyncio.run(_init_picard(), debug=False)
  File "/opt/conda/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/app/t5-unc/utils/picard_model_wrapper.py", line 127, in _init_picard
    await _register_schema(db_id=db_id, db_info=db_info, picard_client=client)
  File "/app/t5-unc/utils/picard_model_wrapper.py", line 133, in _register_schema
    await picard_client.registerSQLSchema(db_id, sql_schema)
thrift.py3.exceptions.TransportError: (<TransportErrorType.UNKNOWN: 0>, 'Channel is !good()', 0, <TransportOptions.0: 0>)

tscholak commented 2 years ago

Hi, that means that the picard backend is either not running (yet) or unresponsive. Does this happen reproducibly?
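One quick way to distinguish "not running" from "unresponsive" is to check whether anything is accepting TCP connections on the backend's port. A minimal sketch, assuming the picard server listens on 127.0.0.1:9090 (adjust host and port to your configuration):

```python
# Diagnostic sketch: is something listening on the picard backend port?
# Host/port values are assumptions; use the ones from your picard config.
import socket

def picard_port_open(host: str = "127.0.0.1", port: int = 9090) -> bool:
    """Return True if a TCP connection to host:port succeeds within 2 seconds."""
    try:
        with socket.create_connection((host, port), timeout=2.0):
            return True
    except OSError:
        return False
```

If this returns False, the backend process never came up (or crashed); if it returns True but registration still fails, the server is up but unresponsive at the Thrift level.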

david-seekai commented 2 years ago

Ran into the same issue. I wasn't using the Makefile but was running it with

docker run \
    -it \
    --rm \
    --user 13011:13011 \
    -p 8000:8000 \
    --mount type=bind,source=/Users/david/database,target=/database \
    --mount type=bind,source=/Users/david/transformers_cache,target=/transformers_cache \
    --mount type=bind,source=/Users/david/configs,target=/app/configs \
            tscholak/text-to-sql-eval:35f43caadadde292f84e83962fbe5320a65d338f   \
    /bin/bash -c "python seq2seq/serve_seq2seq.py configs/serve.json"

trace

Traceback (most recent call last):
  File "seq2seq/serve_seq2seq.py", line 151, in <module>
    main()
  File "seq2seq/serve_seq2seq.py", line 97, in main
    model = model_cls_wrapper(AutoModelForSeq2SeqLM).from_pretrained(
  File "seq2seq/serve_seq2seq.py", line 91, in <lambda>
    model_cls=model_cls, picard_args=picard_args, tokenizer=tokenizer
  File "/app/seq2seq/utils/picard_model_wrapper.py", line 199, in with_picard
    asyncio.run(_init_picard(), debug=False)
  File "/opt/conda/lib/python3.7/asyncio/runners.py", line 43, in run
    return loop.run_until_complete(main)
  File "/opt/conda/lib/python3.7/asyncio/base_events.py", line 587, in run_until_complete
    return future.result()
  File "/app/seq2seq/utils/picard_model_wrapper.py", line 127, in _init_picard
    await _register_tokenizer(picard_client=client)
  File "/app/seq2seq/utils/picard_model_wrapper.py", line 145, in _register_tokenizer
    await picard_client.registerTokenizer(json_str)
thrift.py3.exceptions.TransportError: (<TransportErrorType.UNKNOWN: 0>, 'Channel is !good()', 0, <TransportOptions.0: 0>)

eyuansu62 commented 2 years ago

It happens when I use the UnifiedSKG code to train a t5-large model; I copied picard_model_wrapper into the UnifiedSKG code. As far as I can tell, the UnifiedSKG code is the same as the picard code, so I can't understand why this happens.

tscholak commented 2 years ago

Hi, author here. It is not enough to just copy the picard model wrapper; the wrapper is only one small piece of the picard parsing approach. There are library components and a picard executable as well. In most cases I recommend using the eval image with your model: rather than trying to take picard and put it somewhere else, bring what you have (e.g., a checkpoint) to this codebase and its docker images. For most people, that approach is quicker and easier.

tscholak commented 2 years ago

@david-seekai let me have a look.

david-seekai commented 2 years ago

Great, thanks. I'm using the GKE Container-Optimized OS.

https://cloud.google.com/container-optimized-os/docs/concepts/features-and-benefits

Let me know if I can help in any way or if any other info is needed.

eyuansu62 commented 2 years ago

@tscholak yep, I do use the picard eval image as the docker environment. I did try to bring the checkpoint to this codebase, but I ran into a problem: the same checkpoint gets different performance in the two codebases, for example 68.5 EM in UnifiedSKG but 66.2 EM in this codebase without picard mode. I modified the code so that the inputs are the same in both codebases. Have you ever seen this problem?

tscholak commented 2 years ago

I recommend comparing the generated outputs. If they are the same, then the issue is in the evaluation; if they are not the same, then something differs in how they are generated.
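A small sketch of that comparison, assuming each codebase writes one prediction per line to a text file in the same example order (file names here are placeholders, not paths from either codebase):

```python
# Sketch: find the first few positions where two prediction files disagree.
# Assumes one prediction per line, same example order in both files.
def first_differences(path_a: str, path_b: str, limit: int = 5):
    """Return up to `limit` tuples (index, line_a, line_b) that differ."""
    with open(path_a) as fa, open(path_b) as fb:
        a_lines = [line.strip() for line in fa]
        b_lines = [line.strip() for line in fb]
    diffs = [(i, a, b)
             for i, (a, b) in enumerate(zip(a_lines, b_lines))
             if a != b]
    return diffs[:limit]
```

Looking at the first few mismatches usually makes it obvious whether the difference comes from tokenization, decoding parameters, or post-processing.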

eyuansu62 commented 2 years ago

Yep, I compared the generated outputs and they are different. I kept the package versions, input format, and functions such as generate and evaluate the same, but I still don't know why this happens. It is really confusing.

abharga2 commented 2 years ago

Hey! I'm reproducibly getting the same error using 'make prediction_output'.

Any ideas?

Tomcatiiii commented 1 year ago

Hello, I fixed this problem by changing time.sleep(1) to time.sleep(10) on line 95 of seq2seq/utils/picard_model_wrapper.py. I guess the child process had not started successfully yet when the main process tried to connect, which is why this error is reported.
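A fixed sleep works but wastes time on fast machines and can still be too short on slow ones. A more robust sketch is to poll the backend's port until it accepts a connection or a deadline passes; the host, port (9090), and timeout here are assumptions to adapt to your picard configuration:

```python
# Sketch: instead of a fixed time.sleep before connecting, wait until the
# picard backend's TCP port actually accepts connections, up to a deadline.
# Host/port/deadline values are assumptions, not picard defaults.
import socket
import time

def wait_for_port(host: str = "127.0.0.1", port: int = 9090,
                  deadline_s: float = 30.0, poll_s: float = 0.5) -> bool:
    """Return True once host:port accepts a TCP connection, False on timeout."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        try:
            with socket.create_connection((host, port), timeout=poll_s):
                return True
        except OSError:
            time.sleep(poll_s)
    return False
```

Calling something like this in place of the fixed sleep, and raising an error when it returns False, would make the 'Channel is !good()' failure mode explicit instead of timing-dependent.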