gramener / gramex-nlg

Natural Language Generation for Gramex applications.

Add add_manual_template functionality #8

Closed samrudh closed 4 years ago

samrudh commented 5 years ago

For the below two issues:

https://github.com/gramener/gramex-nlg/issues/7 https://github.com/gramener/gramex-nlg/issues/8

jaidevd commented 5 years ago

Thanks for this, @samrudh

I understand the motivation behind this functionality, but the implementation might have to be done more carefully. For example, using the 'str.replace' method to turn a token into a "variable" won't work reliably. Consider the following example.

sentence = 'The value of Pi up to 3 decimal places is 3.142.'
# which _ideally_ gets templatized as follows:
template = """
{% from math import  pi %}
The value of Pi up to 3 decimal places is {{ round(pi, 3) }}
"""

Now we want to convert the token 3 into a variable which can be controlled from the outside. In order to do this, I cannot do template.replace because the token 3 occurs in more than one place!
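To make the ambiguity concrete, here is a small sketch using only plain Python strings. The variable name n_dec and the one-line version of the template are just illustrative assumptions.

```python
sentence = "The value of Pi up to 3 decimal places is 3.142."
template_line = "The value of Pi up to 3 decimal places is {{ round(pi, 3) }}"

# str.replace matches characters, not tokens, so it also hits the
# leading "3" of "3.142" in the original sentence.
broken_sentence = sentence.replace("3", "{{ n_dec }}")

# In the template it rewrites the "3" inside round(pi, 3) as well,
# which nests one expression inside another.
broken_template = template_line.replace("3", "{{ n_dec }}")

print(broken_sentence)
print(broken_template)
```

Neither result is a usable template, and limiting the replacement with the count argument of str.replace only helps when the occurrence we want happens to come first.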

However, we can do this with spacy. In a spacy doc, every token has a unique index (its position in the doc), regardless of how often the text of that token appears in the document. Therefore, it's better to attach a variable template to a spacy token instead of a Python string.
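A rough illustration of replacing by position rather than by text follows. To keep it dependency-free, a small regex tokenizer stands in for a spacy doc here; that tokenizer and the helper names (tokenize, replace_token, to_text) are assumptions for this sketch, not part of spacy or nlg.

```python
import re

# Stand-in tokenizer (an assumption, not spacy): decimal numbers, words,
# or single punctuation marks, each with its trailing whitespace.
TOKEN_RE = re.compile(r"(\d+\.\d+|\w+|[^\w\s])(\s*)")


def tokenize(text):
    """Return a list of (token, trailing_whitespace) pairs."""
    return [(m.group(1), m.group(2)) for m in TOKEN_RE.finditer(text)]


def replace_token(tokens, index, expression):
    """Return a copy with only the token at `index` turned into an expression."""
    out = list(tokens)
    _, whitespace = out[index]
    out[index] = ("{{ " + expression + " }}", whitespace)
    return out


def to_text(tokens):
    """Join tokens back into a string, preserving the original spacing."""
    return "".join(token + whitespace for token, whitespace in tokens)


sentence = "The value of Pi up to 3 decimal places is 3.142."
tokens = tokenize(sentence)

# Both "3" and "3.142" start with the same character, but they sit at
# different positions, so each can be addressed on its own.
positions = [i for i, (token, _) in enumerate(tokens) if token.startswith("3")]

templated = to_text(replace_token(tokens, positions[0], "n_dec"))
print(positions)
print(templated)
```

With a real spacy doc the same idea would presumably use the token's position in the doc instead of this hand-rolled list.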

In general, we need a mechanism to attach a template to a spacy document. The nlg.js library solves this problem differently, since there you can select a piece of text in the UI and add a template formula for it - so there is no ambiguity about which substring the template formula replaces. Let me add a similar interface here in the Python module, and then we can use this. Please keep this PR open and I'll update you.

samrudh commented 5 years ago

Makes sense.. so how are you thinking about the interface? Something like this?

def add_manual_template(spacy_doc_index, manual_template): ...

jaidevd commented 5 years ago

Hi @samrudh

Here's the interface from our call:

text = "something"
fh_args = {///}
df

template = templatize(text, df, fh_args)

from nlg import Template
from spacy import load
nlp = load('/')
doc = nlp(text)

tmpl = template.templatize()

doc = nlp('Value of pi up to 3 decimals is 3.142')
template = Template(doc, df, fh_args)

# Token 5 is the "3" in "up to 3 decimals"; the last token is "3.142".
template.replace(5, 'n_dec')
template.replace(len(doc) - 1, 'round(pi, n_dec)')

template.set_variable_value('n_dec', 3)
template.templatize()
'''
{% set n_dec = 3 %}
Value of pi up to {{ n_dec }} decimals is {{ round(pi, n_dec) }}
'''

template.render(df=df, n_dec=3)
# Value of pi up to 3 decimals is 3.142

I'll create an initial stub for the Template class, and we can work on populating the logic then.
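For discussion, a very rough sketch of what such a stub might look like. Everything here is an assumption: the class is named TemplateSketch to avoid confusion with the eventual nlg.Template, it takes a plain list of token strings instead of a spacy doc, it ignores df and fh_args, and it leaves rendering to the template engine.

```python
class TemplateSketch:
    """Hypothetical stand-in for the proposed Template class."""

    def __init__(self, tokens):
        self.tokens = list(tokens)   # token texts, addressed by position
        self.replacements = {}       # token index -> template expression
        self.variables = {}          # variable name -> default value

    def replace(self, index, expression):
        """Mark the token at `index` to be emitted as a template expression."""
        if not 0 <= index < len(self.tokens):
            raise IndexError("token index out of range: %d" % index)
        self.replacements[index] = expression

    def set_variable_value(self, name, value):
        """Record a default value to emit as a `set` statement."""
        self.variables[name] = value

    def templatize(self):
        """Build the template source: `set` lines first, then the sentence."""
        header = [
            "{% set " + name + " = " + repr(value) + " %}"
            for name, value in self.variables.items()
        ]
        body = " ".join(
            "{{ " + self.replacements[i] + " }}" if i in self.replacements else token
            for i, token in enumerate(self.tokens)
        )
        return "\n".join(header + [body])


# Whitespace splitting stands in for nlp(text) in this sketch.
tokens = "Value of pi up to 3 decimals is 3.142".split()
template = TemplateSketch(tokens)
template.replace(5, "n_dec")
template.replace(len(tokens) - 1, "round(pi, n_dec)")
template.set_variable_value("n_dec", 3)
result = template.templatize()
print(result)
```

Keeping the replacements in a dict keyed by token index means the original tokens are never mutated, so a replacement can be changed or dropped later without re-parsing the text.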

jaidevd commented 5 years ago

@samrudh Can you please add me as collaborator to your fork? That way I can push some commits to this PR and we can continue from there.

samrudh commented 5 years ago

Added.. https://github.com/samrudh/gramex-nlg/invitations