bjascob / amrlib

A python library that makes AMR parsing, generation and visualization simple.
MIT License

How to control the number of generation? #46

14H034160212 closed 2 years ago

14H034160212 commented 2 years ago

Hi,

I have a question about how to control the number of generations. For example, when I use the T5 parser to parse a sentence into an AMR graph, I only get one graph per sentence. How can I generate five graphs for each sentence? Thanks.

import amrlib
stog = amrlib.load_stog_model()
graphs = stog.parse_sents(['This is a test of the system.', 'This is a second sentence.'])
for graph in graphs:
    print(graph)

Kind regards, Qiming

bjascob commented 2 years ago

The code isn't set up to return multiple graphs for a parse, but it does generate them internally, so you can get them if you're willing to code a bit. The code that does the parsing is the inference class.

If you instantiate this with stog = amrlib.load_stog_model(ret_raw_gen=True, num_beams=5), it will return the 5 raw graphs generated by the model (see line 86). These will not be in proper AMR format, so you need to run the PenmanDeSerializer on them, as seen in line 94. Also note that, for efficiency, the sentences are presorted by length, so you need to de-sort the model output, as seen in line 92.

Basically you're going to need to get the raw graphs back and then process them through a modified copy of lines 90 through 103, where the code is modified to return all 5 versions of graphs_final instead of just the first one that correctly de-serializes.
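As a rough illustration of that post-processing step, here is a minimal sketch. It is not the library's code. It assumes the raw output is a flat list with a fixed number of candidates per sentence, stored contiguously in the length-sorted order, and that you have the sort order available as a list of original sentence indices. The names regroup_raw_graphs, sorted_indices, graphs_per_sentence and deserialize are hypothetical. In practice, deserialize would be a small wrapper around PenmanDeSerializer that returns None when a raw graph fails to de-serialize.

```python
from typing import Callable, List, Optional, Sequence


def regroup_raw_graphs(
    raw_graphs: Sequence[str],
    sorted_indices: Sequence[int],
    graphs_per_sentence: int,
    deserialize: Callable[[str], Optional[str]],
) -> List[List[str]]:
    """Return every candidate graph that de-serializes, per sentence, in input order.

    raw_graphs          flat model output, graphs_per_sentence entries per sentence
                        (assumed contiguous, in the length-sorted order)
    sorted_indices      sorted_indices[i] is the original index of the i-th sorted
                        sentence (assumed to be known to the caller)
    deserialize         hypothetical wrapper that returns a graph string, or None
                        if the raw graph is invalid
    """
    if len(raw_graphs) != len(sorted_indices) * graphs_per_sentence:
        raise ValueError("expected graphs_per_sentence raw graphs for each sentence")

    per_sentence: List[List[str]] = [[] for _ in sorted_indices]
    for pos, orig_idx in enumerate(sorted_indices):
        # Slice out all candidates belonging to the sentence at this sorted position.
        start = pos * graphs_per_sentence
        candidates = raw_graphs[start:start + graphs_per_sentence]
        for raw in candidates:
            graph = deserialize(raw)
            if graph is not None:
                # Keep every valid graph instead of stopping at the first one.
                per_sentence[orig_idx].append(graph)
    return per_sentence
```

Writing into per_sentence[orig_idx] does the de-sorting, and a sentence can end up with fewer than five graphs (or none) if some candidates fail to de-serialize.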

Of course, if it's easier, you can just copy the entire class and modify it. If you do this, you just need to instantiate it with the model_dir param so it knows where to find your model. The call you're using, load_stog_model(), takes care of calling this class with model_dir for you.