TomLucidor closed this 1 month ago
For security reasons, JavaScript running in a web page isn't allowed to read pages from other sites (the same-origin policy), unless the other site explicitly permits it via CORS headers.
Generally, what people do is implement a server-side script that fetches the other web page on behalf of any JavaScript that requests it. For example, JavaScript could ask a PHP script for a URL, and PHP would then download that web page and return it to JavaScript for further client-side processing.
Webcams are different, as browsers offer JavaScript the built-in getUserMedia API to start the camera and get frames from it.
WebLLM offers you a lot of features, and you could easily implement web-page loading in your own project using the pipeline mentioned above.
Here's an example PHP script you could use:
<?php
# example of how you would call this script:
# webpage_downloader.php?url=https%3A%2F%2Fwww.example.com
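$url_to_get = filter_var($_GET["url"] ?? "", FILTER_VALIDATE_URL);
# only allow http(s), so the script cannot be used to read local files (file://, php://, ...)
if ($url_to_get === false || !preg_match('#^https?://#i', $url_to_get)) {
    http_response_code(400);
    exit('{"error":"invalid url"}');
}
$webpage = file_get_contents( $url_to_get );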
# or more complex:
#$opts = array(
# 'http'=>array(
# 'method'=>"GET",
# 'header'=>"Accept-language: en\r\n"
# )
#);
#$context = stream_context_create($opts);
#$webpage = file_get_contents($url_to_get, false, $context);
# Encode the result as JSON for transport back to JavaScript. json_encode() escapes
# both values; addslashes() or htmlspecialchars() alone would not produce valid JSON.
header('Content-Type: application/json');
echo json_encode(array("url" => $url_to_get, "content" => $webpage));
?>
@flatsiedatsie but what about web crawling or web caching, are there ways to get it working with Web-LLM?
See my previous answer.
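For the caching part of the question, the server-side script could keep a copy of each downloaded page on disk and reuse it while it is still fresh. The following is only a minimal sketch of that idea in Python, not something WebLLM provides: the function name `fetch_cached`, the `page_cache` directory, and the one-hour default are all hypothetical choices, and `fetch` stands for whatever function actually downloads the page.

```python
import hashlib
import time
from pathlib import Path


def fetch_cached(url, fetch, cache_dir="page_cache", max_age_seconds=3600):
    """Return the page for `url`, reusing a copy on disk if it is fresh enough.

    `fetch` is any callable that takes a URL and returns the page as a string
    (hypothetical; supply your own downloader).
    """
    cache_path = Path(cache_dir)
    cache_path.mkdir(parents=True, exist_ok=True)

    # Hash the URL so that any URL maps to a safe, fixed-length file name.
    digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
    entry = cache_path / (digest + ".html")

    # Serve the cached copy if it exists and is younger than max_age_seconds.
    if entry.exists() and time.time() - entry.stat().st_mtime < max_age_seconds:
        return entry.read_text(encoding="utf-8")

    # Otherwise download the page and store it for next time.
    content = fetch(url)
    entry.write_text(content, encoding="utf-8")
    return content
```

Crawling a small blog would then amount to calling `fetch_cached` once per post URL; repeated requests within the hour are answered from disk instead of hitting the blog again.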
@flatsiedatsie to clarify, do you have any recommendation that uses Python or other tools rather than PHP? So that it can be hooked back into Web-LLM as a static file?
Ah, Python. No, not really. But if you're calling it from the browser, it should be pretty similar to the PHP example. You could ask a WebLLM AI to turn the PHP code into Python code ;-)
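As a rough idea of what such a translation might look like, here is a sketch using only the Python standard library. It mirrors the PHP example (take a `url` query parameter, download the page, return it as JSON), but the names `is_allowed_url`, `build_response`, `download`, `run`, and port 8000 are assumptions made for illustration. Note that it still has to run as a small server; a purely static file cannot fetch other sites on the browser's behalf.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse
from urllib.request import urlopen


def is_allowed_url(url):
    """Accept only absolute http(s) URLs, so file:// and similar are rejected."""
    parts = urlparse(url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def download(url, timeout=10):
    """Download a page and decode it to text."""
    with urlopen(url, timeout=timeout) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


def build_response(url, fetch=download):
    """Return (HTTP status, JSON body) for the requested URL."""
    if not is_allowed_url(url):
        return 400, json.dumps({"error": "invalid url"})
    try:
        content = fetch(url)
    except OSError as exc:  # urllib's URLError and HTTPError are OSError subclasses
        return 502, json.dumps({"url": url, "error": str(exc)})
    return 200, json.dumps({"url": url, "content": content})


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Example request: /?url=https%3A%2F%2Fwww.example.com
        query = parse_qs(urlparse(self.path).query)
        url = query.get("url", [""])[0]
        status, body = build_response(url)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json; charset=utf-8")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


def run(port=8000):
    """Serve on localhost only; call run() to start the server."""
    HTTPServer(("127.0.0.1", port), Handler).serve_forever()
```

Calling `run()` starts the server, after which the page's JavaScript can request `http://127.0.0.1:8000/?url=...` and parse the JSON reply. If the page itself is served from a different origin than this script, the handler would also need to send an `Access-Control-Allow-Origin` header; that part is left out here.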
Sorry for asking this, but it seems that this browser chatbot cannot read from web pages (individual pages or lists of them), which makes it less useful than expected. If this software can support webcams, could it also support caching a small number of web pages (e.g. from a blog)? https://github.com/mlc-ai/web-llm/issues/291