Mintplex-Labs / anything-llm

The all-in-one Desktop & Docker AI application with full RAG and AI Agent capabilities.
https://anythingllm.com
MIT License
20.87k stars 2.18k forks

[BUG]: fetching website with 4 letters extension #963

Closed Kofangun closed 5 months ago

Kofangun commented 5 months ago

How are you running AnythingLLM?

Docker (local)

What happened?

I get an error when I try to fetch a website whose domain extension (TLD) has four letters, such as .tech.

Are there known steps to reproduce?

No response

timothycarambat commented 5 months ago

You can fetch any website that is available, as it's just about DNS resolution. There is no limitation on the TLD used. What is more likely is that the website is blocking the scraper for any number of reasons. Some websites don't allow any kind of automated loading tools, like Puppeteer, which we use.

Closing as wontfix for now. If you have any test websites, you can post them here and I'm happy to debug or test on my end, but scraper blocking is likely the issue's root cause.
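To tell the two cases above apart (a hostname that does not resolve versus a site that resolves but refuses automated clients), a rough check along these lines may help. This is a minimal sketch in Python using only the standard library. It is not AnythingLLM's code and does not use Puppeteer; the function names `hostname_of`, `classify_status`, and `diagnose`, and the choice of status codes treated as "possibly blocked", are assumptions made for illustration.

```python
import socket
import urllib.error
import urllib.request
from urllib.parse import urlparse


def hostname_of(url: str) -> str:
    """Extract the hostname from a URL. The TLD length plays no role here."""
    host = urlparse(url).hostname
    if not host:
        raise ValueError(f"no hostname in {url!r}")
    return host


def classify_status(status: int) -> str:
    """Hypothetical mapping from an HTTP status code to a likely cause."""
    if 200 <= status < 300:
        return "ok"
    # Assumption: these codes are commonly returned to blocked clients.
    if status in (401, 403, 429):
        return "possibly blocked"
    return "other error"


def diagnose(url: str, timeout: float = 10.0) -> str:
    """Check DNS resolution first, then attempt a plain HTTP GET."""
    host = hostname_of(url)
    try:
        socket.getaddrinfo(host, 443)
    except socket.gaierror:
        return "dns failure"

    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify_status(response.status)
    except urllib.error.HTTPError as exc:
        # The server answered, but with an error status.
        return classify_status(exc.code)
    except OSError:
        # Covers URLError, timeouts, and refused connections.
        return "connection error"
```

Calling `diagnose("https://komasterai.tech")` from inside the Docker container would indicate whether the name resolves there at all. A plain HTTP request can still succeed where a headless browser is blocked (or the reverse), so a result of "ok" only rules out DNS and basic connectivity problems.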

Kofangun commented 5 months ago

Would you try https://komasterai.tech ?

timothycarambat commented 5 months ago

Worked for me; I was able to get the website text.