ServiceNow / picard

PICARD - Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models. PICARD is a ServiceNow Research project that was started at Element AI.
https://arxiv.org/abs/2109.05093
Apache License 2.0

Custom Dataset Eval error #119

Closed yazdipour closed 1 year ago

yazdipour commented 1 year ago

Hey @tscholak, wonderful work. I was trying to run eval on my custom dataset, but I get this error at the end of Batch #1, once it reaches 100%:

***** Running Evaluation *****
  Num examples = 133
  Batch size = 1
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 133/133 [1:00:10<00:00, 61.23s/it]

<string>:6: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
wandb: Network error (ConnectionError), entering retry loop.
/bin/bash: line 1:     9 Killed                  python seq2seq/run_seq2seq.py configs/eval_seoss.json
make: *** [Makefile:181: eval_seoss] Error 137

Basically, I just want to get a result on my machine with a basic config for now, so that I can extend it later with a bigger model and a bigger dataset.

The dataset contains only one DB; I tried to mimic the Spider format and run the eval on it.
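For readers who hit the same failure: the `Error 137` that make reports in the log above follows the usual shell convention of 128 plus a signal number, so it corresponds to signal 9 (SIGKILL), which matches the `Killed` line from bash. On Linux this is often the out-of-memory killer, though the log alone does not confirm that. A minimal sketch of the decoding, where `describe_exit_status` is a hypothetical helper and not part of PICARD:

```python
import signal


def describe_exit_status(status: int) -> str:
    """Explain a shell-style exit status such as the one make reports.

    Hypothetical helper for illustration; it only applies the common
    "128 + signal number" convention used by bash.
    """
    if status == 0:
        return "success"
    if status > 128:
        signum = status - 128
        try:
            # Resolve the number to a name such as SIGKILL where the platform defines it.
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"
        return f"killed by {name}"
    return f"exited with error code {status}"


# On Linux and macOS this reports SIGKILL for status 137.
print(describe_exit_status(137))
```

If the cause is memory, watching the container's memory usage during the final metric computation, or raising the container's memory limit, would be the first things to check.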

yazdipour commented 1 year ago

Also, this is the Makefile with my custom command. With it, I am trying to skip the build step and use your image with my custom Python files: https://github.com/yazdipour/picard/blob/ec84e151e2cb8647f3f196ac3d739e5d805f0741/Makefile#L175

yazdipour commented 1 year ago

I tried once more, this time running eval directly inside the container:

E0119 00:10:25.066828  4996 GeneratedCodeHelper.cpp:77] invalid message from client in function process
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████| 133/133 [1:15:45<00:00, 120.56s/it]<string>:6: VisibleDeprecationWarning: Creating an ndarray from ragged nested sequences (which is a list-or-tuple of lists-or-tuples-or ndarrays with different lengths or shapes) is deprecated. If you meant to do this, you must specify 'dtype=object' when creating the ndarray.
/opt/conda/lib/python3.7/multiprocessing/semaphore_tracker.py:144: UserWarning: semaphore_tracker: There appear to be 6 leaked semaphores to clean up at shutdown
  len(cache))

yazdipour commented 1 year ago

Well, I am going to close the issue. I found that when I do the same thing on a fork of this project, I don't hit this issue. The /docu-t5 fork probably added something that fixes it.