Closed Hyperclaw79 closed 6 years ago
Hi @Hyperclaw79,
Can you provide a little bit more context on what you are trying to do?
@patapizza Here's the mail I sent to help@wit.ai:
On Wed, Jan 3, 2018 at 12:36 AM, "Hyperclaw79" <harshith.thota7@gmail.com> wrote:
First of all, I'd like to thank you for this wonderful open source solution for NLP. I am in love with it.
I am developing a WittyMusicBot which fetches song information.
And I have a question: Is it possible to assign priorities to an entity? (I don't mean the roles.)
Small example of my use case:
Let's say I have two inputs:
Information for Numb by Linkin Park
Details about Castle of Glass
In the first case, the detection would be as such:
"Numb by Linkin Park" -> data
"Numb" -> song (composite under data)
"Linkin Park" -> artist (composite under data)
In the second case, the detection would be as such:
"Castle of Glass" -> song
Currently, even the first input gets detected as song instead of as data. So, I would like to make sure that it gets detected as a song only if it is not detected as data beforehand.
I built the song dataset before I introduced data. It is lengthy, and fixing it would mean editing every sample that was labeled before data existed.
So, if a simple ranking method could solve this instead of manually editing my song's dataset, it would be appreciated.
If there is a way to achieve this apart from exhaustive training, please help me figure it out.
Thank you in advance. Hoping for a helpful reply.
~HT
@Hyperclaw79 why are you using composite entities here, as opposed to an artist entity and a song entity?
The data entity you're proposing seems very general and will probably be hard to train, e.g. you want "play [Castle of Glass]" to be song, but "play [Castle of Eminem]" to be data (Eminem made a song called Castle).
You can use the API to update your app programmatically and avoid time-consuming manual modifications.
Closing for now, feel free to comment/reopen if I missed something.
@blandinw
The data entity you're proposing seems very general
It isn't as general as it looks. The data entity gets triggered only when both the song and the artist are present, with "by" (or "'s") as a keyword between them.
The data entity you're proposing seems very general and will probably be hard to train, e.g. you want "play [Castle of Glass]" to be song, but "play [Castle of Eminem]" to be data (Eminem made a song called Castle).
Also, judging from your example, I think you misunderstood my use case. Compare it with this:
"play Castle of Glass" will be detected as song, while "play Castle of Glass by Eminem" will be detected as data, within which Castle of Glass will be detected as song and Eminem as artist under the data entity.
You can use the API to update your app programmatically
Like I've mentioned before, I am using this as built-in NLP for Facebook, so I can't control it from my end. For now I am using workarounds that detect the "by" keyword when a user message arrives, but this messes up the detection of song and artist when they consist of multiple words mixed with irrelevant words like "look for", "details", etc. This is the whole reason why I want to use a trainable NLP.
Several things here:
1/ I still have the same understanding after reading your comment. What is "play Castle of Eminem" supposed to return? My guess would be that you want it to return "Castle" as a song and "Eminem" as an artist. The issue with that is that you would also like "play Castle of Glass" to return "Castle of Glass" as a song. This will be hard to train because "play Castle of Glass" is very similar (to someone or an algorithm that has no pre-existing knowledge of existing songs and artists). You'll probably have to enumerate all songs in your training samples.
I still don't understand the need for a composite entity here. Why have the data entity at all?
2/ Are you using a custom token in Built-in NLP? If so, you can still use our HTTP API to make changes. Built-in NLP only does a GET /message call on your behalf, for convenience.
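For illustration, the GET /message call mentioned above can be sketched with the standard library as follows. This is only a sketch: the token is a placeholder, the version date and the helper name build_message_request are assumptions, and the request is built but not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_message_request(query, token, version="20180103"):
    """Build (but do not send) a GET /message request for the Wit HTTP API."""
    # The version date and the token are placeholders chosen for this sketch.
    url = "https://api.wit.ai/message?" + urlencode({"v": version, "q": query})
    return Request(url, headers={"Authorization": "Bearer " + token}, method="GET")


req = build_message_request("Details about Castle of Glass", "YOUR_SERVER_ACCESS_TOKEN")
print(req.full_url)
```

Passing the returned object to urllib.request.urlopen would perform the actual call.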
What is "play Castle of Eminem" supposed to return? My guess would be that you want it to return
"Castle" as a song and "Eminem" as an artist.
No, I strictly plan to limit delimiters to "by" and "'s". On the other hand, "of" will still be considered part of a song.
why have the data entity at all?
When there is only a song, I want to send the song entity. When the input is song <by> artist, I want it to be detected as data, within which it can be classified further. The main reason to use the data entity is to first get hold of the phrase containing "by", and then split it to get the song and the artist. If I use only song, the word "by" is sometimes taken as part of either the song or the artist, which would be wrong.
Built-in NLP only does a GET /message call on your behalf, for convenience.
oh I didn't know about this! Thanks.
you can still use our HTTP API to make changes
except that I'm not able to make a single simple POST request to /samples without getting a generic error like "something went wrong".
Do you have an example of a POST /samples request that returns a 500, along with the app id? Thanks.
I'm using a python script to generate the url and send the request.
import datetime
import requests

headers = {
    'Authorization': 'Bearer P4PPKDLH7LUJJXN2YCOKNXYM37IJGKV5',
    'Content-Type': 'application/json'
}
# generate_json, tupleList and queryList are defined elsewhere in the script.
payload = generate_json(tupleList, queryList)
with open('samples.json', 'w+') as f:
    f.write(payload)
url = 'https://api.wit.ai/samples?v={}'.format(datetime.date.today().strftime("%Y%m%d"))
response = requests.post(url, headers=headers, data=payload)
And the sample json looks like this:
[
  {
    "text": "Give me details for Perfect by Ed Sheeran",
    "entities": {
      "data": [
        {
          "entities": {
            "song": [
              {
                "value": "Perfect",
                "type": "value"
              }
            ],
            "artist": [
              {
                "value": "Ed Sheeran",
                "type": "value"
              }
            ]
          },
          "value": "Perfect by Ed Sheeran",
          "type": "value"
        }
      ]
    }
  }
]
@Hyperclaw79 Please see the docs example. There is no data field. Composite entities need to be specified under subentities, not entities (this is not documented).
-edit-: I see from another issue that data is an entity. In this case it should look like:
{
  "entity": "data",
  "value": "Perfect by Ed Sheeran",
  "subentities": [
    {
      "entity": "song",
      "value": "Perfect"
    },
    {
      "entity": "artist",
      "value": "Ed Sheeran"
    }
  ]
}
Note that you need to specify start and end for each non-trait entity. Composite entity indexes are relative to the entity above.
Ah thanks for this. will try it out.
(this is not documented)
well, that was the problem.
Note that you need to specify start and end for each non-trait entity.
Can you please give me an example of how to use start and end?
Edit: And this works. Thank you.
Example of start and end here: https://github.com/wit-ai/wit-api-only-tutorial#add-date-detection
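As a rough sketch of the offsets discussed in this thread, the snippet below computes start and end for the sample sentence used earlier. It assumes that end is exclusive and that subentity offsets are counted from the beginning of the parent data phrase, as described above. The helper names span and build_sample are hypothetical.

```python
def span(text, value):
    """Return (start, end) character offsets of value in text, end exclusive."""
    start = text.index(value)  # raises ValueError if value is absent
    return start, start + len(value)


def build_sample(text, phrase, song, artist):
    """Build one sample with a composite data entity and relative subentity offsets."""
    p_start, p_end = span(text, phrase)
    subentities = []
    for name, value in (("song", song), ("artist", artist)):
        s, e = span(phrase, value)  # offsets relative to the data phrase
        subentities.append({"entity": name, "value": value, "start": s, "end": e})
    return {
        "text": text,
        "entities": [{
            "entity": "data",
            "value": phrase,
            "start": p_start,
            "end": p_end,
            "subentities": subentities,
        }],
    }


sample = build_sample(
    "Give me details for Perfect by Ed Sheeran",
    "Perfect by Ed Sheeran",
    "Perfect",
    "Ed Sheeran",
)
print(sample)
```

Because str.index finds the first occurrence only, a value that appears more than once in the text would need explicit offsets instead.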
How can I obtain the details of all the entities once my order is complete?
Do you want to request a feature, report a bug, or ask a question about wit? Feature
What is the current behavior? No priority or ranking among entities.
What is the expected behavior? Scope to assign priorities to entities such that there is a better control over the detection.
@l5t In response to your mail: I cannot use 2 wit apps because I'm using this wit app as the custom app for Facebook's built-in NLP.