leon-ai / leon

🧠 Leon is your open-source personal assistant.
https://getleon.ai
MIT License

Have I Been Pwned Module #69

iifeoluwa closed this issue 5 years ago

iifeoluwa commented 5 years ago

Feature Use Case

✨ Checker Package - Have I Been Pwned module

Feature Proposal

Have Leon check if an email address has been compromised using https://haveibeenpwned.com/API/v2.
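As a rough illustration of the lookup this proposal implies, here is a minimal sketch that only builds the request URL for one address. It assumes the v2 `breachedaccount` endpoint path; `HIBP_BASE_URL` and `build_breach_url` are hypothetical names chosen for this example, and no request is actually sent.

```python
from urllib.parse import quote

# Assumption: the v2 "breachedaccount" endpoint. Check the API docs before use.
HIBP_BASE_URL = "https://haveibeenpwned.com/api/v2/breachedaccount/"


def build_breach_url(email: str) -> str:
    """Return the lookup URL for one account (hypothetical helper).

    The address is trimmed and percent-encoded so that characters such as
    "@" are safe to place in the URL path.
    """
    return HIBP_BASE_URL + quote(email.strip(), safe="")


if __name__ == "__main__":
    print(build_breach_url("email@domain_name.tld"))
```

Keeping URL construction separate from the HTTP call makes the encoding step easy to test without network access.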

iifeoluwa commented 5 years ago

Going to begin work on this once I get the go-ahead from @louistiti.

louistiti commented 5 years ago

Hello @iifeoluwa,

Awesome! I'm looking forward to seeing the results.

One quick thing: it would be nice to let the user set up pre-defined email addresses in the config.json. That way, when the user says "Have I been pwned?", Leon checks the email addresses in the config.json and queries the Have I Been Pwned API. And if the user provides email addresses in the query, like "Has email@domain_name.tld been pwned?", Leon can extract the email address(es) (here email@domain_name.tld) via the entities and pass them to the Have I Been Pwned API.

iifeoluwa commented 5 years ago

Hi @louistiti, thank you for the pointers.

I'd like to clarify the part about email addresses in the config, though. To the best of my knowledge, Leon doesn't ask for an email address at any point in the install process. Should the user manually insert an email entry in the package's config file, or do we set it as another ENV variable that is read?

Setting it as an ENV variable might be less appropriate, though: if the user wants to enter more than one email address, we'd have to delimit the addresses with a special character, which isn't all that great.

louistiti commented 5 years ago

Indeed, Leon does not ask for any email address during setup. But if you take a look at this part of the docs, you can see that every package has its own configuration. In your case, the configuration is located in packages/checker/config/config.json. You will need to add a haveibeenpwned key to this file, with an emails sub-key that contains the list of email addresses. It should look something like this:

{
  "isitdown": {
    "options": {}
  },
  "haveibeenpwned": {
    "emails": [
      "email@domain_name.tld",
      "another@domain_name.tld"
    ],
    "options": {}
  }
}

Leon should be able to find the email address in the query itself (e.g. "Has my-email@tld.com been pwned?"). You can use the entities from your module file to extract the email addresses. You can take the Is It Down module as an example, as it extracts the domain names from the queries (cf. here).

If the user does not provide any email addresses in the query, then Leon will query the Have I Been Pwned API with the email addresses it can find in the config file. You can easily grab a config property value via utils.config().
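The fallback described above could be sketched roughly as follows. This is not Leon's actual module code: `resolve_emails` is a hypothetical helper, the entity type name `email` is an assumption, and each entity is assumed to be a dict with an `entity` type and a `resolution.value`, as in the query-object sample later in this thread. In a real module, the config list would come from something like utils.config().

```python
from typing import Any, Dict, List


def resolve_emails(entities: List[Dict[str, Any]], config_emails: List[str]) -> List[str]:
    """Pick the addresses to check (hypothetical helper).

    Addresses found in the query win; the config list is only a fallback.
    Duplicates are dropped case-insensitively while preserving order.
    """
    from_query = [
        entity.get("resolution", {}).get("value")
        for entity in entities
        if entity.get("entity") == "email"  # assumed entity type name
    ]
    from_query = [email for email in from_query if email]

    candidates = from_query if from_query else list(config_emails)

    seen = set()
    unique = []
    for email in candidates:
        key = email.lower()
        if key not in seen:
            seen.add(key)
            unique.append(email)
    return unique
```

Returning a plain list keeps the API call loop independent of where the addresses came from.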

iifeoluwa commented 5 years ago

Hi @louistiti, I hit a hiccup while testing this module and was hoping you'd have some pointers.

Following the instructions in the docs, I tried running this:

PIPENV_PIPFILE=bridges/python/Pipfile pipenv run python bridges/python/main.py en checker isitdown "Check github.com is up"

But what I got was this error:

Traceback (most recent call last):
  File "bridges/python/main.py", line 4, in <module>
    import utils
  File "/Users/ifeoluwa/Projects/forks/leon/bridges/python/utils.py", line 22, in <module>
    queryobjfile = open(queryobjectpath, 'r', encoding = 'utf8')
FileNotFoundError: [Errno 2] No such file or directory: 'en'

This happens because queryobjectpath is set to argv[1] (which is 'en'), and the script then tries to open that as a file, which does not exist. Is there an error somewhere, or am I missing something?

Thanks for your help!
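For illustration, the failure mode can be isolated in a small sketch. It assumes the bridge treats its first argument as a path to a JSON file, as the traceback suggests; `load_query_object` is a hypothetical helper, not the code in utils.py.

```python
import json


def load_query_object(path: str) -> dict:
    """Load a query object from a JSON file (hypothetical helper).

    A missing file is reported with a message that names the offending
    argument, instead of a bare FileNotFoundError at import time.
    """
    try:
        with open(path, "r", encoding="utf8") as query_file:
            return json.load(query_file)
    except FileNotFoundError as error:
        raise ValueError(
            f"expected a path to a JSON query object, got {path!r}"
        ) from error
```

Calling it with 'en' when no such file exists would raise the ValueError, which makes the mismatch between the old argument style and the new file-based one easier to spot.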

louistiti commented 5 years ago

🔧 [1.0.0-beta.2] Improve NLU for more detailed queries

louistiti commented 5 years ago

I've recently updated how modules are executed from Leon's brain. Leon now uses a file instead of command-line args.

The standalone module execution works like this:

PIPENV_PIPFILE=bridges/python/Pipfile pipenv run python bridges/python/main.py server/src/query-object.sample.json

The JSON file depends on the module you are executing. For the Is It Down module, it should look like this:

{
  "lang": "en",
  "package": "checker",
  "module": "isitdown",
  "query": "Check if github.com, mozilla.org and twitter.com are up",
  "entities": [
    {
      "sourceText": "github.com",
      "utteranceText": "github.com",
      "entity": "url",
      "resolution": {
        "value": "github.com"
      }
    },
    {
      "sourceText": "mozilla.org",
      "utteranceText": "mozilla.org",
      "entity": "url",
      "resolution": {
        "value": "mozilla.org"
      }
    },
    {
      "sourceText": "twitter.com",
      "utteranceText": "twitter.com",
      "entity": "url",
      "resolution": {
        "value": "twitter.com"
      }
    }
  ]
}

In your case, the entities that may be extracted from a query are email addresses. As Leon uses NLP.js, the JSON entities array should look like that.
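By analogy with the Is It Down sample above, a query object for this module might be generated as in the sketch below. The module name `haveibeenpwned` (borrowed from the config key) and the entity type `email` are assumptions, and `build_query_object` is a hypothetical helper for producing a test file, not part of Leon.

```python
import json
from typing import Any, Dict, List


def build_query_object(query: str, emails: List[str]) -> Dict[str, Any]:
    """Build a query object mirroring the Is It Down sample (hypothetical helper)."""
    return {
        "lang": "en",
        "package": "checker",
        "module": "haveibeenpwned",  # assumed module name
        "query": query,
        "entities": [
            {
                "sourceText": email,
                "utteranceText": email,
                "entity": "email",  # assumed entity type name
                "resolution": {"value": email},
            }
            for email in emails
        ],
    }


if __name__ == "__main__":
    sample = build_query_object(
        "Has email@domain_name.tld been pwned?",
        ["email@domain_name.tld"],
    )
    print(json.dumps(sample, indent=2))
```

The printed JSON can be saved to a file and passed as the single argument to bridges/python/main.py, in the same way as the sample file in the command above.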

I'll update the docs within one or two days.

louistiti commented 5 years ago

I've edited the docs; you can take a look here 😄

Let me know if you have any questions.

iifeoluwa commented 5 years ago

Great, thanks!

louistiti commented 5 years ago

✨ Checker Package - Have I Been Pwned Module

louistiti commented 5 years ago

Done in https://github.com/leon-ai/leon/pull/80.