This issue tracks the working status of the text ingestion pipeline.
Issues:
Article titles are not extracted properly during ingestion. -> DONE
Add an option to manually enter a title for articles. -> Implemented but not working
Add an option to manually enter a title for unstructured text. -> Implemented but not working
Add bot protection bypass/mitigation. -> DONE (simple fix, not a long-term solution, but it works)
Potentially look at a headless browser for article scraping, possibly using the current user's cookies/session tokens if applicable. -> DONE (ArchiveBox has a full, multi-platform setup and can be relied on for the client side; look at Firecrawl or similar for the 'server' version)