Closed DavyLandman closed 4 years ago
Hi @DavyLandman It is a first attempt to fix this issue. The following picture shows an example of typhonML model that has been defined by the textual editor:
The possible PrimitiveType
s are:
enum PrimitiveDataType returns PrimitiveDataType:
int = 'int' |
bigint = 'bigint' |
string = 'string' |
text = 'text'|
point = "point" |
polygon = "polygon"|
bool = "bool"|
float = "float"|
blob = "blob"|
date = "date" |
datetime = "datetime";
The list of NLP task types are:
@Danny2097 can you confirm?
I'll apply the graphical editor changes and merge issue35
branch to the master one once you confirm that the changes match your needs.
Looks good 👍
The only thing I'm missing is the size of the string field, or is that a separate parameter? (string is like varchar of sql)
@md2manoppello It is my understanding that the list of NlpTasks
are from an old version of the Type system developed by a previous researcher. Since joining the project myself and @DrRaja have modified the type system as the old version has some extensibility issues. Many of the tasks still exist but others have been added, refactored or removed.
I will post an issue regarding the updated type system.
I don't know how much the modification to the type system will impact ML
or QL
but I apologise in advance.
for QL it should be okay, as long as we can freeze it soon.
@md2manoppello @DavyLandman
I have added the revised NlpTaskType following a discussion with @DrRaja. They can be found in this Comment https://github.com/typhon-project/typhonml/issues/37#issuecomment-599497608.
Hello to all. We are refactoring metamodel and language, trying to respect all your considerations/requests. Could this be a solution presented in these figures?
Specifically, note the integer value (optional) in the definition of the strings (line 3 and 12) and the management of the freetext type (line 5).
In this link you can find our tml example: https://www.dropbox.com/s/n4ukdk7c0of7w6c/TmlNewExample.tml?dl=0
Thx for your feedback.
I assuming eng_pol
is a workflow name., from an NLP perspective it looks good 👍 .
@Danny2097 Yes it's the workflow name.
I think it would not make sense to implement. As I would consider the output of a NLP workflow to enrich the field it is attributed too. Using your example the description field would be populated by the output of [eng_pol, CoreferenceResolution]
. If a user required the same text to be processed by another workflow then I think they should specify that as an additional field int the ML model.
entity Product {
id : int
name : string(32)
price : float
description : freetext[eng_pol, CoreferenceResolution]
sentiment : freetext[eng_pol_sa, SentimentAnalysis]
}
...
But you are right by defining freetext
using an array
users could create models like you proposed. The only other way I can think of doing this would be to use a tuple
for freetext
.
entity Product {
id : int
name : string(32)
price : float
description : freetext(eng_pol, CoreferenceResolution)
sentiment : freetext(eng_pol_sa, SentimentAnalysis)
}
...
This would prevent modelers from incorrectly specifying several workflows or NLP tasks and limit them to only specifying a workflowName
and NlpTaskType
when using the freetext
type. I don't know how possible this is or if it is feasible, it is merely a suggestion.
What do you think @DavyLandman, @Bitico and @md2manoppello ?
shouldn't the workflow name be connected to the kind of nlp task?
so for the same freetext
field you can send freetext[Sentiment[eng_pol_sa], CoreferenceResolution[eng_pol]]
?
else you can only do a single kind of nlp action per attribute?
That would make more sense.
Could this be the solution?
Here the complete example: https://www.dropbox.com/s/ifvgdzwmm9dfy3s/TmlNewExample.tml?dl=0
looks good 👍
How about giving string
square brackets as well?
@DavyLandman Sure! no problem for square brackets on string parameter.
@DavyLandman Sure! no problem for square brackets on string parameter.
thanks, just saw that it was in the original request with round brackets, my bad :)
As we discussed in Athens, we should fix the types a user can pick from during the creation of the ML model.
In the meeting we ended up with this set of primitive types: (rascal syntax for bnf like grammars)