superscriptjs / superscript

A dialogue engine for creating chat bots
http://superscriptjs.com
MIT License

Conversation does not seem to work #86

Closed botisma closed 9 years ago

botisma commented 9 years ago

Hi,

I have to convert this example from RiveScript:

+ knock knock
- Who's there?

+ *
% who is there
- <star> who?

+ *
% * who
- LOL! <star>! That's funny!

to SuperScript:

+ ~emohello [*~2]
- Hi!
- Hi, how are you?
- How are you?
- Hello
- Howdy
- Ola

+ knock knock
- Who is there?

+ *
% who is there?
- <cap> who?

+ *
% * who?
- LOL! <cap>! That's funny!

but here is the result:

Welcome to the Telnet server!
Hello 127.0.0.1:51526! Type /quit to disconnect.

You> knock knock

Bot> Who is there?
You> mike

Bot>
You>

Any idea, please?

botisma commented 9 years ago

With this script:

+ ~emohello [*~2]
- Hi!
- Hi, how are you?
- How are you?
- Hello
- Howdy
- Ola

+ knock knock
- Who is there?

% who is there?
+ *
- <cap> who?

% * who?
+ *
- LOL! <cap>! That's funny!

result:

Welcome to the Telnet server!
Hello 127.0.0.1:51539! Type /quit to disconnect.

You> knock knock

Bot> LOL! undefined! That's funny!
You>

silentrob commented 9 years ago

Let me see if I can repro the issue. I just noticed % has regressed in the tests, so that might be causing some of the problem.

botisma commented 9 years ago

Here is the debug output:

parse topics
  Normalizer Loaded File +0ms { key: '_sys', file: 'systemessentials.txt' }
  Normalizer Loaded File +22ms { key: '_extra', file: 'substitutes.txt' }
  Normalizer Loaded File +107ms { key: '_contractions', file: 'contractions.txt' }
  Normalizer Loaded File +12ms { key: '_interjections', file: 'interjections.txt' }
  Normalizer Loaded File +36ms { key: '_britsh', file: 'british.txt' }
  Normalizer Loaded File +95ms { key: '_spellfix', file: 'spellfix.txt' }
  Normalizer Loaded File +228ms { key: '_texting', file: 'texting.txt' }
  Normalizer Done Reading Subs +5ms
  Normalizer Done Loading files +0ms
  ParseContents Trigger Found +0ms knock knock
  ParseContents Response: +580ms who is there?
  ParseContents Trigger Found +0ms *
  ParseContents Response: +86ms <cap> who?
  ParseContents Trigger Found +0ms *
  ParseContents Response: +92ms LOL! <cap>! That's funny!
  Parse Sorting Topics +0ms
  Parse Sorting triggers... +0ms
  Parse Analyzing topic random +0ms
  Sort Sorting triggers with priority 0 +0ms
  Sort Totally atomic trigger and 2 words. +0ms
  Sort Has a * wildcard with 2 words. +0ms
  Sort Has a * wildcard with 2 words. +0ms
  Sort ip=0 +0ms
  Parse Sorting Previous Topics +1ms
  Parse Sorting triggers... +0ms
  Sort Sorting reverse triggers for %Previous groups... +0ms

botisma commented 9 years ago

Hmm, it seems this is not really implemented in parsecontents.js.

Am I right?

Edit: hmm, it seems I am not ;)

silentrob commented 9 years ago

No it is implemented. The problem seems related to not finding the previous gambit and trigger lookup.

I'm not sure if it is because of how the model is imported or something else. I should have a patch for it by tomorrow.

silentrob commented 9 years ago

This line is where it is failing https://github.com/silentrob/superscript/blob/master/lib/topics/topic.js#L241

botisma commented 9 years ago

OK, thanks! I am going to study the code so I can help more in the future ;)

silentrob commented 9 years ago

Okay, cool. If you have any questions about the code, just let me know. Always happy to help.

botisma commented 9 years ago

OK, the issue seems to be in the import (https://github.com/silentrob/superscript/blob/master/lib/topics/import.js#L81)

silentrob commented 9 years ago

Well, if you are still walking through it, the problem is in the script import step: https://github.com/silentrob/superscript/blob/master/lib/topics/import.js#L100-L103 The reply is null for each previous gambit. We are getting closer.

silentrob commented 9 years ago

Okay, that was AWESOME. :+1:

botisma commented 9 years ago

I don't really understand how I can fix that, because the JSON hierarchy is not the same between gTopics and gPrevTopics:

"gTopics": {
        "random": {
            "dh3sMFO2": {
                "trigger": "knock knock",
                "options": {
                    "isQuestion": false,
                    "qType": false,
                    "qSubType": false,
                    "filter": false
                },
                "reply": {
                    "7ttjeJdH": "who is there?"
                },
                "raw": "knock knock"
            }
        }
    },
    "gPrevTopics": {
        "random": {
            "UPQyl9h7": {
                "U87wHuc3": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "0": "<cap> who?"
                    },
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            },
            "ajfOTy2V": {
                "bHIP0cn0": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "0": "LOL! <cap>! That's funny!"
                    },
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            }
        }
    },

botisma commented 9 years ago

Maybe UPQyl9h7 and ajfOTy2V should be dh3sMFO2 (or the ID of the reply, like 7ttjeJdH)?

silentrob commented 9 years ago

Argh. That JSON file is unwieldy. I believe the mapping for those is in the "thats" key, and the reason for that was to allow many-to-many. I'm tempted to simplify it, but I'll save that exercise for another day.

On a side note, the Web Editor interacts directly with the mongo model and bypasses this step.

silentrob commented 9 years ago

Yeah, you are right. Given what the import is using, we are unable to make the correct mapping without using the "thats" key. And we really want to reference the "7ttjeJdH" key on the gambit.
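
As a rough illustration of the mapping being discussed (a sketch only, not the code in import.js): if gPrevTopics were keyed by the ID of the reply that precedes each follow-up, the importer could attach follow-ups with a plain lookup. The helper name `followUpsFor` is hypothetical, and the data is trimmed from the JSON earlier in the thread with the gPrevTopics key assumed to be the reply ID.

```javascript
// Trimmed data in the shape shown earlier. The gPrevTopics key is ASSUMED
// to be the ID of the preceding reply ("7ttjeJdH"), not a random ID.
const data = {
  gTopics: {
    random: {
      dh3sMFO2: { trigger: 'knock knock', reply: { '7ttjeJdH': 'who is there?' } },
    },
  },
  gPrevTopics: {
    random: {
      '7ttjeJdH': {
        U87wHuc3: { trigger: '(?:.*?)', reply: { 0: '<cap> who?' } },
      },
    },
  },
};

// Hypothetical helper: for each reply of a top-level gambit, list the IDs
// of the follow-up gambits that hang off that reply.
function followUpsFor(data, topic, gambitId) {
  const gambit = data.gTopics[topic][gambitId];
  const prev = data.gPrevTopics[topic] || {};
  const result = {};
  Object.keys(gambit.reply).forEach((replyId) => {
    // With unrelated keys (such as "UPQyl9h7") this lookup finds nothing.
    result[replyId] = Object.keys(prev[replyId] || {});
  });
  return result;
}

console.log(followUpsFor(data, 'random', 'dh3sMFO2'));
```

With the random keys from the original export, the same lookup would return an empty list for "7ttjeJdH", which matches the "reply is null" symptom described above.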

botisma commented 9 years ago

OK, thanks. I will try to make a fix, or maybe rewrite the import to use gSorted if you think that is better.

silentrob commented 9 years ago

My thought is if we can make it work without the "that" and "that_trig" that would be a huge win, and much simpler.

The backstory here is that the script (and JSON file) was the only way to do everything, and it was not scalable, so I added the import and the new mongo representation. Clearly this hasn't worked since then, and in all fairness I have been working on the web interface, so I didn't notice.

Also, the gSorted is obsolete as well. We sort the topics in mongo too.

botisma commented 9 years ago

OK, here is the final JSON we need to generate:

{
    "gTopicFlags": {
        "random": []
    },
    "gTopics": {
        "random": {
            "dh3sMFO2": {
                "trigger": "knock knock",
                "options": {
                    "isQuestion": false,
                    "qType": false,
                    "qSubType": false,
                    "filter": false
                },
                "reply": {
                    "7ttjeJdH": "who is there?"
                },
                "raw": "knock knock"
            }
        }
    },
    "gPrevTopics": {
        "random": {
            "7ttjeJdH": {
                "U87wHuc3": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "1ttjwJdE": "<cap> who?"
                    },
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            },
            "1ttjwJdE": {
                "bHIP0cn0": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "bd9PgDl0": "LOL! <cap>! That's funny!"
                    },
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            }
        }
    },
    "gSorted": {
        "topics": {
            "random": [
                "dh3sMFO2"
            ]
        },
        "thats": {
            "random": [
                "7ttjeJdH",
                "1ttjwJdE"
            ]
        },
        "that_trig": {
            "random": {
                "7ttjeJdH": [
                    "U87wHuc3"
                ],
                "1ttjwJdE": [
                    "bHIP0cn0"
                ]
            }
        }
    },
    "keywords": {},
    "checksums": {
        "./topics/main.ss": "ed980de8c513ad5474e4a856883766fd7c9addb4"
    }
}

I can now import everything into MongoDB, but when I try the bot I get this error:

User '127.0.0.1:63538' has connected.

/mybot/node_modules/superscript/node_modules/lemmer/node_modules/node-wordnet/node_modules/es6-shim/es6-shim.js:1073
      return new OrigRegExp(pattern, flags);
             ^
SyntaxError: Invalid regular expression: /^(\b?:.(?:.*\s?)?\b)\s?$/: Nothing to repeat
    at RegExp (<anonymous>)
    at new RegExp (/mybot/node_modules/superscript/node_modules/lemmer/node_modules/node-wordnet/node_modules/es6-shim/es6-shim.js:1073:14)
    at /mybot/node_modules/superscript/lib/topics/topic.js:140:39
    at Object.exports.postParse (/mybot/node_modules/superscript/lib/parse/regexreply.js:232:3)
    at eachGambitHandle (/mybot/node_modules/superscript/lib/topics/topic.js:109:18)
    at /mybot/node_modules/superscript/node_modules/async/lib/async.js:118:13
    at Array.forEach (native)
    at _each (/mybot/node_modules/superscript/node_modules/async/lib/async.js:39:24)
    at Object.async.each (/mybot/node_modules/superscript/node_modules/async/lib/async.js:117:9)
    at done (/mybot/node_modules/superscript/lib/topics/topic.js:251:21)
    at done (/mybot/node_modules/superscript/node_modules/async/lib/async.js:128:19)
    at Promise.<anonymous> (/mybot/node_modules/superscript/node_modules/async/lib/async.js:25:16)
    at Promise.<anonymous> (/mybot/node_modules/mongoose/node_modules/mpromise/lib/promise.js:177:8)
    at Promise.emit (events.js:95:17)
    at Promise.emit (/mybot/node_modules/mongoose/node_modules/mpromise/lib/promise.js:84:38)
    at Promise.fulfill (/mybot/node_modules/mongoose/node_modules/mpromise/lib/promise.js:97:20)

with this chat

You> knock knock

Bot> who is there?
You> mike
Connection closed by foreign host.

botisma commented 9 years ago

The gambits look like this:

{
    "_id" : ObjectId("5529d23efc434283623da866"),
    "input" : "(?:.*?)",
    "trigger" : "(\\b?:.(?:.*\\s?)?\\b)\\s?",
    "redirect" : "",
    "replies" : [ 
        "5529d23efc434283623da867"
    ],
    "filter" : "",
    "qSubType" : "",
    "qType" : "",
    "isQuestion" : false,
    "id" : "U87wHuc3",
    "__v" : 0
}

Maybe the trigger just needs to be set to * and not (\\b?:.(?:.*\\s?)?\\b)\\s?
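
A small check of why the stored trigger fails, assuming (as the error message suggests) that the trigger is wrapped in `^...$` before compilation: `\b` is a zero-width assertion, and JavaScript does not allow a quantifier such as `?` directly after it. The helper name `compileError` is made up for this sketch.

```javascript
// Try to compile a trigger the way the error message suggests it is
// compiled ('^' + trigger + '$'). Returns the error name, or null on success.
function compileError(trigger) {
  try {
    new RegExp('^' + trigger + '$');
    return null;
  } catch (err) {
    return err.name;
  }
}

// The "trigger" field stored on the gambit above.
const storedTrigger = '(\\b?:.(?:.*\\s?)?\\b)\\s?';
// The "input" field stored on the same gambit.
const storedInput = '(?:.*?)';

console.log(compileError(storedTrigger)); // the quantified \b is rejected
console.log(compileError(storedInput));   // a plain non-capturing group compiles
```

The `input` value is itself a valid pattern, so the breakage appears to be introduced when that value is processed again into `trigger`.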

silentrob commented 9 years ago

Dammit, this is a new one, and related to the lemma change I added last week. I'm going to load in your JSON and see if I can repro it.

silentrob commented 9 years ago

Yeah, something funky is happening with that trigger for sure. We convert it to "zerowidthstar" here: https://github.com/silentrob/superscript/blob/master/lib/parse/regexreply.js#L42-L44

and then back to (?:.*?) here: https://github.com/silentrob/superscript/blob/master/lib/parse/regexreply.js#L73

botisma commented 9 years ago

OK, with this change we are close to getting something good ;)

Hello 127.0.0.1:63678! Type /quit to disconnect.

You> knock knock

Bot> who is there?
You> mike

Bot> undefined who?
You> mike

Bot> LOL! undefined! That's funny!

The variable is not captured (mike in this case).

EDIT:

Here is the final JSON I am trying to generate with my update:

{
    "gTopicFlags": {
        "random": []
    },
    "gTopics": {
        "random": {
            "dh3sMFO2": {
                "trigger": "knock knock",
                "options": {
                    "isQuestion": false,
                    "qType": false,
                    "qSubType": false,
                    "filter": false
                },
                "reply": {
                    "7ttjeJdH": "who is there?"
                },
                "raw": "knock knock"
            }
        }
    },
    "gPrevTopics": {
        "random": {
            "7ttjeJdH": {
                "U87wHuc3": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "1ttjwJdE": "<cap1> who?"
                    },
                    "raw": "(?:.*?)",
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            },
            "1ttjwJdE": {
                "bHIP0cn0": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "bd9PgDl0": "LOL! <cap1>! That's funny!"
                    },
                    "raw": "(?:.*?)",
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            }
        }
    },
    "keywords": {},
    "checksums": {
        "./topics/main.ss": "ed980de8c513ad5474e4a856883766fd7c9addb4"
    }
}

I deleted gSorted. Can you tell me if I have forgotten something?

silentrob commented 9 years ago

Yep, so unlike RiveScript, * does not capture. If you change that to (*) or *~2 it should work.
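
To make the "undefined who?" output concrete, here is a minimal sketch in plain JavaScript. It is not SuperScript's reply code: the `fillCap` helper is hypothetical, and `(\S+)` is only an assumed stand-in for whatever a capturing wildcard compiles to.

```javascript
// Hypothetical helper: match the input against a trigger regex and fill
// <cap> in the reply from the first capture group.
function fillCap(triggerSource, input, template) {
  const match = new RegExp('^' + triggerSource + '$').exec(input);
  if (!match) return null;
  // match[1] is undefined when the trigger has no capturing group.
  return template.replace('<cap>', String(match[1]));
}

const nonCapturing = '(?:.*?)'; // what "*" compiled to in the JSON above
const capturing = '(\\S+)';     // ASSUMED stand-in for a capturing wildcard

console.log(fillCap(nonCapturing, 'mike', '<cap> who?')); // "undefined who?"
console.log(fillCap(capturing, 'mike', '<cap> who?'));    // "mike who?"
```

The non-capturing group still matches "mike", which is why the follow-up gambit fires, but there is nothing in group 1 to substitute.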

silentrob commented 9 years ago

Did you manually change the JSON or change the parse to use the correct Reply IDs for the previous Topics?

botisma commented 9 years ago

Manually for now. I am trying to understand the process before making the change in parse.

silentrob commented 9 years ago

I'm just working through refactoring out the gSorted now. I think removing code is better than adding more.

botisma commented 9 years ago

You mean in parse? Because I am starting to code now ;)

silentrob commented 9 years ago

Yeah. Okay, I'll stop. I would far rather have a pull request! And I think you know what is going on now.

botisma commented 9 years ago

OK, I will try to make a PR ASAP ;)

silentrob commented 9 years ago

Try running https://github.com/silentrob/superscript/blob/master/test/continue.js to verify, and I would be stoked!

botisma commented 9 years ago

Hmm, just one thing. My final JSON is:

{
    "gTopicFlags": {
        "random": []
    },
    "gTopics": {
        "random": {
            "dh3sMFO2": {
                "trigger": "knock knock",
                "options": {
                    "isQuestion": false,
                    "qType": false,
                    "qSubType": false,
                    "filter": false
                },
                "reply": {
                    "7ttjeJdH": "who is there?"
                },
                "raw": "knock knock"
            }
        }
    },
    "gPrevTopics": {
        "random": {
            "7ttjeJdH": {
                "U87wHuc3": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "1ttjwJdE": "*~2 who?"
                    },
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            },
            "1ttjwJdE": {
                "bHIP0cn0": {
                    "trigger": "(?:.*?)",
                    "reply": {
                        "bd9PgDl0": "LOL! *~2! That's funny!"
                    },
                    "options": {
                        "isQuestion": false,
                        "qType": false,
                        "qSubType": false,
                        "filter": false
                    }
                }
            }
        }
    },
    "gSorted": {
        "topics": {
            "random": [
                "dh3sMFO2"
            ]
        },
        "thats": {
            "random": [
                "7ttjeJdH",
                "1ttjwJdE"
            ]
        },
        "that_trig": {
            "random": {
                "7ttjeJdH": [
                    "U87wHuc3"
                ],
                "1ttjwJdE": [
                    "bHIP0cn0"
                ]
            }
        }
    },
    "keywords": {},
    "checksums": {
        "./topics/main.ss": "ed980de8c513ad5474e4a856883766fd7c9addb4"
    }
}

but the output looks like:

Bot> who is there?
You> mike

Bot> *~2 who?
You>

silentrob commented 9 years ago

So you are seeing that because the trigger needs to run through https://github.com/silentrob/superscript/blob/master/lib/parse/parsecontents.js#L277 and editing the JSON directly won't work.

silentrob commented 9 years ago

We pre-bake the trigger into a regex here to speed up execution in the actual message flow, but we also do a post-regex step too. That is not important for this flow.
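
The pre-baking idea can be sketched as follows. This is a deliberately simplified, hypothetical compiler and not the code in parsecontents.js: it only understands `*` (non-capturing) and `*N` (capture exactly N words), and the function name `compileTrigger` is invented for the example.

```javascript
// Hypothetical, simplified trigger compiler. NOT SuperScript's parser.
//   "*"  -> non-capturing wildcard
//   "*N" -> capture exactly N whitespace-separated words
function compileTrigger(raw) {
  const source = raw
    .split(/\s+/)
    .map((token) => {
      if (token === '*') return '(?:.*?)';
      const exact = /^\*(\d+)$/.exec(token);
      if (exact) {
        const n = Number(exact[1]);
        return '(\\S+(?:\\s+\\S+){' + (n - 1) + '})';
      }
      // Plain word: escape anything that is special in a regex.
      return token.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
    })
    .join('\\s+');
  return new RegExp('^' + source + '$', 'i');
}

// Compile once, at parse/import time...
const compiled = {
  'knock knock': compileTrigger('knock knock'),
  '*1': compileTrigger('*1'),
};

// ...so the message flow only has to run an already-built regex.
console.log(compiled['knock knock'].test('Knock knock'));
console.log(compiled['*1'].exec('mike')[1]);
```

This also shows why hand-editing a reply or trigger in the JSON has no effect: the raw script text is only interpreted during the compile step.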

killix commented 9 years ago

Hey guys! I have just pushed a commit to try to fix this issue. I can also add the cleanup of the gSorted stuff. You can see my commit; it's WIP, but I plan to finish in a few hours!

The parsing works in your case with:

+ knock knock
- who is there?
    + *1
    % who is there?
    - <cap> who?

        + *1
        % <cap> who?
        - LOL! <cap>! That's funny!

and

+ knock knock
- who is there?
- who is?
    + *1
    % who is there?
    - <cap> who?

        + *1
        % <cap> who?
        - LOL! <cap>! That's funny!

In the last case, the conversation is enabled only for who is there? See my plan.

silentrob commented 9 years ago

So far so good. I think your TODO matches GH-84 which was what I was originally going in to fix this weekend when I noticed this bug to begin with. Thanks for jumping in! I will review a little closer and make sure we aren't missing anything, also thanks for catching the linting errors.

killix commented 9 years ago

commit

Match with something like:

+ knock knock
- who is there?
- who?
    + *1
    % *
    - <cap> who?

        + *1
        % <cap> who?
        - LOL! <cap>! That's funny!

Are you OK with this syntax: % *

killix commented 9 years ago

OK, maybe it is a good idea to rewrite the output JSON. I am thinking of generating a data.json that looks like this:

{

    "topics": {
        "random": {
            "flags":[],
            "keywords":[]
        }
    },

    "gambits": {
        "3pXCsUnP": {
            "topic": "random",
            "trigger": "knock knock",
            "raw": "knock knock",
            "options": {
                "isQuestion": false,
                "qType": false,
                "qSubType": false,
                "filter": false
            },
            "reply": ["IGFf3Afw", "ZaydSu6y"]
        },
        "90kOiHY2": {
            "topic": "random",
            "trigger": "(\\S+(:?\\s+\\S+){0})",
            "raw": "*1",
            "options": {
                "conversations": ["IGFf3Afw", "ZaydSu6y"],
                "isQuestion": false,
                "qType": false,
                "qSubType": false,
                "filter": false
            },
            "reply": ["7qHRSnOs"]
        },
        "plAv7fWD": {
            "topic": "random",
            "trigger": "(\\S+(:?\\s+\\S+){0})",
            "raw": "*1",
            "options": {
                "conversations": ["7qHRSnOs"],
                "isQuestion": false,
                "qType": false,
                "qSubType": false,
                "filter": false
            },
            "reply": ["mOlakKzA"]
        }
    },

    "replys": {
        "IGFf3Afw": "who is there?",
        "ZaydSu6y": "who?",
        "7qHRSnOs": "<cap> who?",
        "mOlakKzA": "LOL! <cap>! That's funny!"
    },

    "checksums": {
        "./topics/main.ss": "645b0d278e08b65d05c0b5543e0039ca07ccda30"
    }

}

what do you think?
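
One way a consumer could use this layout, sketched under the assumption that `options.conversations` lists the reply IDs a gambit may follow (the helper name `indexByPreviousReply` is hypothetical, and the data is trimmed from the JSON above):

```javascript
// Trimmed from the proposed data.json: only the fields this sketch needs.
const data = {
  gambits: {
    '3pXCsUnP': { raw: 'knock knock', options: {}, reply: ['IGFf3Afw', 'ZaydSu6y'] },
    '90kOiHY2': { raw: '*1', options: { conversations: ['IGFf3Afw', 'ZaydSu6y'] }, reply: ['7qHRSnOs'] },
    plAv7fWD: { raw: '*1', options: { conversations: ['7qHRSnOs'] }, reply: ['mOlakKzA'] },
  },
};

// Hypothetical helper: invert "conversations" into a map from a reply ID
// to the gambits that are allowed to follow that reply.
function indexByPreviousReply(gambits) {
  const index = {};
  Object.keys(gambits).forEach((gambitId) => {
    const previous = gambits[gambitId].options.conversations || [];
    previous.forEach((replyId) => {
      (index[replyId] = index[replyId] || []).push(gambitId);
    });
  });
  return index;
}

console.log(indexByPreviousReply(data.gambits));
```

Because every gambit and reply has its own ID in one flat map, the previous-reply relation needs no separate "thats" or "that_trig" tables.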

silentrob commented 9 years ago

Wow, that JSON is super clean and much more readable now. Internally the Reply becomes a mini-topic that can hold another gambit, but I'm not sure if that matters here, as long as we can create conversations beyond one reply.

I think the proposed change is fine, and really the JSON is an intermediate format anyway, but this is way nicer than what existed before.

silentrob commented 9 years ago

I think your proposed change with * is okay to start and may even be good enough in the long run. It could be a problem when scripting more complex conversations, where you might just want to match on a piece of the reply and not the whole thing.

Also, for reference, I had explored this idea in GH-58. I eventually went a different route, and I think having a many-to-many reply structure is too hard to grok when scripting.

killix commented 9 years ago

I have almost finished the new import and parse.

This kind of script is now supported:

+ ~emohello [*~2]
- Hi!
- Hi, how are you?
- How are you?
- Hello
- Howdy
- Ola

> topic:system topic_name (who knock)
    + knock knock
    - who is there?
    - who?
        + *1
        // % support 'who is there?' or 'who *' (match with all reply 'who') or '*' (wildcard)
        % who is *
        - <cap> who?

            + *1
            % <cap> who?
            - LOL! <cap>! That's funny!
< topic

The generated JSON looks like:

{
    "topics": {
        "random": {
            "flags": [],
            "keywords": []
        },
        "topic_name": {
            "flags": [
                "system"
            ],
            "keywords": [
                "who",
                "knock"
            ]
        }
    },
    "gambits": {
        "8IOkuO9r": {
            "topic": "random",
            "raw": "~emohello [*~2]",
            "options": {
                "isQuestion": false,
                "qType": false,
                "qSubType": false,
                "filter": false
            },
            "replys": [
                "TnWlkGho",
                "jvRvWqGd",
                "J6El4wLX",
                "k0KKCbKO",
                "icc2zzMR",
                "Q7ylyhmD"
            ],
            "trigger": "~emohello(?:\\s*(\\s?(?:[\\w-]*\\s?){0,2})\\s*|\\s*)"
        },
        "SJl5wHSs": {
            "topic": "topic_name",
            "raw": "knock knock",
            "options": {
                "isQuestion": false,
                "qType": false,
                "qSubType": false,
                "filter": false
            },
            "replys": [
                "Y63FbDWI",
                "h0rb2FWj"
            ],
            "trigger": "knock knock"
        },
        "a1bQsKR1": {
            "topic": "topic_name",
            "raw": "*1",
            "options": {
                "isQuestion": false,
                "qType": false,
                "qSubType": false,
                "filter": false,
                "conversations": [
                    "Y63FbDWI"
                ]
            },
            "replys": [
                "4sMEsU9R"
            ],
            "trigger": "(\\S+(:?\\s+\\S+){0})"
        },
        "ayEDdm0P": {
            "topic": "topic_name",
            "raw": "*1",
            "options": {
                "isQuestion": false,
                "qType": false,
                "qSubType": false,
                "filter": false,
                "conversations": [
                    "4sMEsU9R"
                ]
            },
            "replys": [
                "Ro02OWuO"
            ],
            "trigger": "(\\S+(:?\\s+\\S+){0})"
        }
    },
    "replys": {
        "TnWlkGho": "Hi!",
        "jvRvWqGd": "Hi, how are you?",
        "J6El4wLX": "How are you?",
        "k0KKCbKO": "Hello",
        "icc2zzMR": "Howdy",
        "Q7ylyhmD": "Ola",
        "Y63FbDWI": "who is there?",
        "h0rb2FWj": "who?",
        "4sMEsU9R": "<cap> who?",
        "Ro02OWuO": "LOL! <cap>! That's funny!"
    },
    "checksums": {
        "./topics/main.ss": "860f84e806761b234f0db5f0103c27769213bf4c"
    }
}

This needs some tests, especially around the 'raw' field, I think, but it all looks good ;) I plan to do some cleanup before the PR, so if you have any requests or comments, go ahead ;)
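
For readers following along, the way `% who is *` ends up as `"conversations": ["Y63FbDWI"]` in the JSON above can be approximated like this. It is a sketch under assumptions, not the parser from the PR: `*` in a `%` line is treated as "match anything", everything else is matched literally, and both function names are hypothetical.

```javascript
// Turn a % pattern into a regex: "*" matches anything, the rest is literal.
function previousPatternToRegex(pattern) {
  const source = pattern
    .split('*')
    .map((part) => part.replace(/[.+?^${}()|[\]\\]/g, '\\$&'))
    .join('.*');
  return new RegExp('^' + source + '$', 'i');
}

// Keep only the parent reply IDs whose text matches the % pattern.
function resolveConversations(pattern, parentReplyIds, replies) {
  const re = previousPatternToRegex(pattern);
  return parentReplyIds.filter((id) => re.test(replies[id]));
}

// Replies of the "knock knock" gambit, copied from the JSON above.
const replies = { Y63FbDWI: 'who is there?', h0rb2FWj: 'who?' };
const parentIds = ['Y63FbDWI', 'h0rb2FWj'];

console.log(resolveConversations('who is *', parentIds, replies));
console.log(resolveConversations('*', parentIds, replies));
```

Under these assumptions `who is *` selects only "who is there?", while a bare `*` selects every reply of the parent gambit.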

silentrob commented 9 years ago

Okay, this looks great. It not only fixes the bug, but also extends the functionality!

I added the raw field when I ported things over to the mongo models. My thought was that if we ever wanted to revert back to the script format, we could use that field, as it is impossible to recover the intent of the trigger from the regex.

killix commented 9 years ago

OK, but this code confuses me.

silentrob commented 9 years ago

The trigger gets generated in the mongo save hook, I believe, and in the past, if we had a trigger, we just passed it directly in (since raw is new as of 5.1).

silentrob commented 9 years ago

Are you and @botisma pairing on this issue? I just noticed the name change halfway through the thread.

killix commented 9 years ago

No, I just happened to start resolving this issue for my own needs at the same time, I think... My initial plan is a rewrite to use things like promises and Ramda ;)

silentrob commented 9 years ago

That is kinda funny. I thought I was chatting with @botisma all along. Sorry. Well, I'm happy to get a pull request from anyone. Even if you can only get it close, I have no problem finishing it off.

You wanted to rewrite all of superscript to use promises and Ramda? That sounds like a huge task.

killix commented 9 years ago

No problem. Yes, it is a huge task, but my plan is to go one small part at a time.

killix commented 9 years ago

PR, all seems to work on my side.

silentrob commented 9 years ago

Okay I will review it.

silentrob commented 9 years ago

So it appears we are not scoping the conversation reply correctly.

Given

+ i went to highschool
- did you finish ?
  + *
  % did you finish ?
  - i went to university
  - what was it like?

+ i like to travel
- have you been to Madird?
  + ~yes *
  % have you been to Madird?
  - Madird is amazing.

  + ~no *
  % have you been to Madird?
  - Madird is my favorite city.

The * is trying to match at the topic level and not within the conversation / reply.
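
A sketch of the scoping that is wanted here, under stated assumptions: each follow-up gambit carries a `conversations` list of the reply IDs it may follow, and the IDs (`g1`, `r1`, ...) and the function name `candidates` are made up for illustration. This is not the matching code in topic.js.

```javascript
// The script above, flattened. IDs are invented for this example.
const gambits = [
  { id: 'g1', raw: 'i went to highschool', replies: ['r1'] },           // r1: did you finish ?
  { id: 'g2', raw: '*', conversations: ['r1'], replies: ['r2', 'r3'] },
  { id: 'g3', raw: 'i like to travel', replies: ['r4'] },               // r4: have you been to Madird?
  { id: 'g4', raw: '~yes *', conversations: ['r4'], replies: ['r5'] },
  { id: 'g5', raw: '~no *', conversations: ['r4'], replies: ['r6'] },
];

// Follow-ups scoped to the last reply are tried first, then top-level
// gambits. Follow-ups that belong to a different reply are never considered.
function candidates(gambits, lastReplyId) {
  const scoped = gambits.filter((g) => (g.conversations || []).includes(lastReplyId));
  const topLevel = gambits.filter((g) => !g.conversations);
  return scoped.concat(topLevel).map((g) => g.id);
}

console.log(candidates(gambits, 'r4')); // the bare "*" gambit g2 is not in the list
console.log(candidates(gambits, null)); // no conversation in progress
```

With this ordering the bare `*` under "did you finish ?" cannot swallow input that belongs to the Madird conversation or to the topic level.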

It shouldn't be too hard to fix, and we are still much further ahead!