viswavi opened this issue 4 years ago
Did you reduce the batch size to train the model?
I did. I reduced the batch size to 2 (but training for 19 epochs still completed after 3 days without complaint).
This is fine. The output doesn't contain any metadata because the input doesn't actually have any predicted spans, so you can safely ignore those batches.
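In case it helps, here's a minimal sketch of what "safely ignoring" those batches could look like in the salient mention prediction loop. The decode call and the write_predictions helper are illustrative placeholders, not the exact SciREX API:

```python
# Minimal sketch, assuming `model` and `batches` come from the prediction
# script; the decode call and write_predictions() are illustrative placeholders.
for batch in batches:
    output = model.decode(batch)  # placeholder for the salient mention decoder call
    if "metadata" not in output:
        # The document has no predicted spans, so there is nothing to write out.
        continue
    write_predictions(output)  # placeholder for writing results to disk
```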
I'm getting the same error later on in predict_n_ary_relations.py. I used a batch size of 4 for training.
I implemented the fix for the salient mentions as shown above, but I ran into an error at this stage as well.
```
Traceback (most recent call last):
  File "scirex/predictors/predict_n_ary_relations.py", line 109, in <module>
    predict(argv[1], argv[2], argv[3], argv[4], int(argv[5]))
  File "scirex/predictors/predict_n_ary_relations.py", line 77, in predict
    output_res = model.decode_relations(batch)
  File "/storage/home/lpb5347/scratch/scirex/SciREX/scirex/models/scirex_model.py", line 385, in decode_relations
    res["n_ary_relation"] = self._cluster_n_ary_relation.decode(output_n_ary_relation)
  File "/storage/home/lpb5347/scratch/scirex/SciREX/scirex/models/relations/entity_relation.py", line 211, in decode
    "metadata" : output_dict['metadata']
```
Is there a resolution to this? I'm struggling with this issue in predict_n_ary_relations.py with a batch size of 1.
@jeremyadamsfisher a stopgap resolution is to just skip these documents here (here's how I did it in my branch)
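For anyone who doesn't want to dig through the branch, the skip boils down to something like the sketch below. The loop variables and the process_output helper are illustrative stand-ins, not the exact code from my branch:

```python
# Minimal sketch of the stopgap in predict_n_ary_relations.py: documents with
# no predicted spans yield decoder output without a "metadata" key, so skip
# them rather than letting the decode step raise a KeyError.
for batch in batches:
    try:
        output_res = model.decode_relations(batch)
    except KeyError:
        continue  # skip documents whose decoder output is missing "metadata"
    process_output(output_res)  # placeholder for the existing downstream handling
```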
However, even with this fix, I get quite different results on the relation prediction metrics for "End-to-End (gold salient clustering)", which may be an issue if you care about this metric. I think the code does not currently match the paper for this particular evaluation.
With this fix, here are the results I was able to reproduce vs. the originally reported results:
On an unrelated note, I added a few metrics (particularly for relation extraction) as well as significance testing scripts in my branch here: https://github.com/viswavi/SciREX, if you find it useful.
Thanks @viswavi!
I trained the baseline SciREX model, and then tried to make predictions for it, but ran into a problem in the step to predict salient mentions. When I tried running:
I got the error:
It seems that every so often, the output of the salient mention prediction decoder is missing the metadata field. As a stopgap solution, I'm just skipping batches with this issue, but I am concerned that I'll accidentally bias my evaluation somehow by doing this.