PhakphumV / mitmai

A LINE chatbot that checks whether a URL is legitimate or phishing. A little cat chatbot that checks whether a link is a friend or a scammer.
https://lin.ee/71sXy5I
2 stars, 1 fork

Extract keywords from the <head> and <meta> tags of the given URL and identify any inappropriate content #4

Open PhakphumV opened 2 months ago

PhakphumV commented 2 months ago
<html lang="th" class="no-js">
<head>   
    <title>[TITLE]</title>
<meta charset="utf-8">
<meta http-equiv="X-UA-Compatible" content="IE=edge">
<meta name="viewport" content="width=device-width, initial-scale=1, minimal-ui, user-scalable=no">
<meta name="description" content="[DESCRIPTION]">
<meta name="keywords" content="[KEYWORDS]">

Send a GET request to the given URL and parse the HTML content. Extract [TITLE], [DESCRIPTION], and [KEYWORDS]. Check whether any of them contain inappropriate keywords, and flag the URL as 'inappropriate' if any are found.
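The steps above could be sketched roughly as follows, using only the Python standard library. This is an illustration under stated assumptions, not the project's actual implementation: the names `HeadMetaParser`, `extract_fields`, `find_inappropriate`, and `check_url` are hypothetical, the keyword list is a placeholder (the real list lives in #5), and plain case-insensitive substring matching is assumed as the matching rule.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Placeholder keywords for illustration only; the real list is defined in #5.
INAPPROPRIATE_KEYWORDS = {"casino", "slot"}


class HeadMetaParser(HTMLParser):
    """Collects the <title> text and the description/keywords <meta> contents."""

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "description": "", "keywords": ""}
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "meta":
            attr = dict(attrs)
            name = (attr.get("name") or "").lower()
            if name in ("description", "keywords"):
                self.fields[name] = attr.get("content") or ""

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Title text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self.fields["title"] += data


def extract_fields(html):
    """Return the stripped title, description, and keywords found in the HTML."""
    parser = HeadMetaParser()
    parser.feed(html)
    return {key: value.strip() for key, value in parser.fields.items()}


def find_inappropriate(fields, keywords=INAPPROPRIATE_KEYWORDS):
    """Return the sorted keywords that occur (case-insensitively) in any field."""
    text = " ".join(fields.values()).lower()
    return sorted(k for k in keywords if k.lower() in text)


def check_url(url, timeout=10):
    """Fetch the URL with a GET request and flag it if any keyword matches."""
    request = Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with urlopen(request, timeout=timeout) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        html = response.read().decode(charset, errors="replace")
    hits = find_inappropriate(extract_fields(html))
    return {"inappropriate": bool(hits), "matched": hits}
```

Substring matching is used here because Thai text is not separated by spaces, so splitting on whitespace would miss keywords; the trade-off is that short keywords may match inside unrelated words. `check_url` does not handle network errors or non-HTML responses, which a real bot would need to cover.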

PhakphumV commented 2 months ago

The list of inappropriate keywords is defined in #5.