StampyAI / stampy-ui

AI Safety Q&A web frontend
https://aisafety.info
MIT License

Embedded search for other websites #223

Closed Aprillion closed 1 year ago

Aprillion commented 1 year ago

Create a new route with just the search bar and search results as links to the questions on aisafety.info, so that other websites can embed Stampy search via an iframe.

Notes: Showing answers on other web sites would be more complicated because iframes don't have dynamic height, and recommending usage of our API with some SDK in React/Vue/Svelte would be a lot of maintenance work we cannot promise to commit to, so something simple for an iframe sounds like the best option initially.

joepio commented 1 year ago

Thank you for putting this on the roadmap!

Just to reiterate: I would prefer a REST API that takes a query and returns an array of QA combinations, as I can then create my own UI for it that perfectly matches the look and feel of the thing I'm building. I'm assuming you already have some sort of API, and I'll gladly just use that. Hopefully that just means opening up some endpoint / port and tweaking some CORS settings, right?

ccstan99 commented 1 year ago

@joepio So would something like https://nlp.stampy.ai/api/search?query=<query> but instead of (or in addition to) returning the link, you'd like the actual answer?

joepio commented 1 year ago

A JSON array of question-answer pairs sorted by relevance would be best, I think.

Yes!

ccstan99 commented 1 year ago

That already returns a JSON array of Q&A sorted by relevance, except the answers are URLs rather than the "text" of the answer. See https://nlp.stampy.ai/api/search?query=What+is+AGI
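For readers who want to try the endpoint mentioned above, here is a minimal TypeScript sketch. Only the base URL and the `query` parameter come from this thread; the helper names `buildSearchUrl`, `searchStampy`, and `JsonFetcher` are hypothetical, and the shape of each array element is deliberately left as `unknown` because the thread does not spell it out.

```typescript
// Endpoint and "query" parameter are taken from the thread; the rest is illustrative.
const SEARCH_ENDPOINT = "https://nlp.stampy.ai/api/search";

// Build the request URL, encoding spaces as "+" to match the example URL in the thread.
function buildSearchUrl(query: string): string {
  const encoded = encodeURIComponent(query).replace(/%20/g, "+");
  return SEARCH_ENDPOINT + "?query=" + encoded;
}

// Minimal structural type so that the browser's fetch (or a test double) can be passed in.
type JsonFetcher = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Fetch results for a query. Element fields are not assumed here, so callers
// should inspect the actual response before relying on any property names.
function searchStampy(query: string, fetcher: JsonFetcher): Promise<unknown[]> {
  return fetcher(buildSearchUrl(query))
    .then((response) => {
      if (!response.ok) {
        throw new Error("Search failed with HTTP " + response.status);
      }
      return response.json();
    })
    .then((body) => {
      if (!Array.isArray(body)) {
        throw new Error("Expected a JSON array of results");
      }
      return body;
    });
}
```

In a browser this could be called as `searchStampy("What is AGI", fetch)`, assuming the server sends CORS headers that allow the calling origin.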

joepio commented 1 year ago

Ah I see now, great!

But indeed, if the answers are included I can get to work!

ccstan99 commented 1 year ago

OK, I'm going to close the issue in this repo since it'll be dealt with in https://github.com/StampyAI/stampy-nlp/issues/14. Unfortunately, I won't have a chance to work on this until next weekend, but hopefully it shouldn't take too long.

Aprillion commented 1 year ago

@joepio FTR, the way we currently use our API is to fetch https://aisafety.info/questions/allQuestionsOnSite first (which we cache in CloudFlare, so it is hopefully ready for scaling if we suddenly get popular) and filter by status Live on Site (other statuses are used for other features, so we fetch everything). We then run an in-browser plaintext search for queries of up to 2 words, and there is also a small-model semantic search that runs in the browser (the small model was not matching single words very well, so we went with the combined approach)
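A rough TypeScript sketch of the combined approach described above, under stated assumptions: the `title` and `status` field names are hypothetical (only `pageid` and the `Live on Site` status value appear in this thread), and routing longer queries to the semantic search is one possible reading of the description, not a confirmed detail of the stampy-ui code.

```typescript
// Hypothetical shape of one entry from allQuestionsOnSite; field names are assumptions.
interface Question {
  pageid: string;
  title: string;
  status: string;
}

// Keep only questions that are published on the site.
function liveQuestions(all: Question[]): Question[] {
  return all.filter((q) => q.status === "Live on Site");
}

// Split a query into lowercase words, ignoring extra whitespace.
function queryWords(query: string): string[] {
  return query
    .toLowerCase()
    .trim()
    .split(/\s+/)
    .filter((w) => w.length > 0);
}

// Plaintext match: every query word must appear somewhere in the title.
function plaintextSearch(questions: Question[], query: string): Question[] {
  const words = queryWords(query);
  if (words.length === 0) {
    return [];
  }
  return questions.filter((q) => {
    const title = q.title.toLowerCase();
    return words.every((w) => title.indexOf(w) !== -1);
  });
}

type Strategy = "plaintext" | "semantic";

// Queries of up to 2 words use plaintext search; longer ones are assumed
// here to go to the in-browser semantic model (not implemented in this sketch).
function chooseStrategy(query: string): Strategy {
  return queryWords(query).length <= 2 ? "plaintext" : "semantic";
}
```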

the response from allQuestionsOnSite was supposed to only have question titles without answer text, but it looks like there is a bug and there is answer text in that API too 😅 the official way to get answer text as HTML was supposed to be fetching it from https://aisafety.info/questions/${pageid}, e.g. https://aisafety.info/questions/2374

the full search from nlp.stampy.ai (which might not scale to thousands+ concurrent users .. yet) is run from our UI only if the user clicks on I'm asking something else

but if ccstan99 and the team add answer text directly to nlp.stampy.ai, I agree that's a better way to provide the search service to other websites 💯

ccstan99 commented 1 year ago

Oh sorry @Aprillion, did I close this prematurely? Maybe others might still want to embed stampy-ui in an iframe without using the NLP API? In theory, the NLP API should be able to scale, it's just a matter of cost... but we'll see. My main concern was the "unnecessary" traffic from sending all those individual characters as the user types, which seemed better to handle in-browser.

Aprillion commented 1 year ago

no worries, this ticket was about a PoC for https://pauseai.info while we didn't have a proper API solution (one that we can promise to maintain), but nlp.stampy.ai is hopefully going to be the real long-term solution - maybe we will need to update some aspects of it in the future, but I believe it's going to be the best way to share the search with other web sites 🎉

joepio commented 1 year ago

Could you add a CORS header to the response that allows all origins? Or, in this case, one that allows pauseai.info and localhost.

Access-Control-Allow-Origin: https://pauseai.info <= more secure

Access-Control-Allow-Origin: * <= easier
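To make the two options concrete, here is a small TypeScript sketch of how a server might pick the header value. It is not the stampy-nlp implementation; the `corsHeaders` function and the localhost port used in the example are hypothetical. When a specific origin is echoed back, a `Vary: Origin` header is included so caches do not reuse the response for a different origin.

```typescript
// Decide which CORS headers to send for a request.
// allowedOrigins is a hypothetical allow-list; "*" in the list means "allow everyone".
function corsHeaders(
  requestOrigin: string | undefined,
  allowedOrigins: string[]
): Record<string, string> {
  // Easier option: a wildcard lets any site read the response.
  if (allowedOrigins.indexOf("*") !== -1) {
    return { "Access-Control-Allow-Origin": "*" };
  }
  // More restrictive option: echo the origin back only if it is on the allow-list.
  if (requestOrigin !== undefined && allowedOrigins.indexOf(requestOrigin) !== -1) {
    return {
      "Access-Control-Allow-Origin": requestOrigin,
      "Vary": "Origin",
    };
  }
  // Unknown origin: send no CORS headers, so browsers block cross-origin reads.
  return {};
}

// Example allow-list for the case discussed here (the port number is an assumption).
const exampleAllowed = ["https://pauseai.info", "http://localhost:3000"];
const exampleHeaders = corsHeaders("https://pauseai.info", exampleAllowed);
```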

joepio commented 1 year ago

It works locally! Still need the cors stuff to deploy

https://user-images.githubusercontent.com/2183313/236672282-fe9e4256-967a-4149-9470-1a330fbd6d34.mov

ccstan99 commented 1 year ago

Cool! One request: it can get expensive for us to maintain if there's excess traffic. Could you please call the API only after the user has a "complete" query (i.e. has pressed ENTER or a button) instead of updating results with every keyboard character typed? Otherwise, we might have to impose some rate limiting scheme. Our website uses a different search scheme that Aprillion described above.
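One way to honor this request on the client side is sketched below in TypeScript. The `SubmitOnlySearch` class is hypothetical and DOM-free: typing only updates a local draft, and the search callback (which would call the API) runs only on ENTER or an explicit submit.

```typescript
// Tracks what the user has typed and triggers a search only on explicit submit.
class SubmitOnlySearch {
  private draft = "";

  // runSearch is whatever performs the real API call; it is injected by the caller.
  constructor(private readonly runSearch: (query: string) => void) {}

  // Called on every input change: remember the text, but make no network call.
  onInput(text: string): void {
    this.draft = text;
  }

  // Called on key presses: only ENTER counts as a "complete" query.
  onKeyDown(key: string): void {
    if (key === "Enter") {
      this.submit();
    }
  }

  // Called by ENTER or by a search button; blank queries are ignored.
  submit(): void {
    const query = this.draft.trim();
    if (query.length > 0) {
      this.runSearch(query);
    }
  }
}
```

In a page, `onInput` and `onKeyDown` would be wired to the search field's `input` and `keydown` events (passing `event.key`), and `submit` to the button's `click` event.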

@mruwnik will try to look into the cors header when he has some time. https://github.com/StampyAI/stampy-nlp/issues/16

joepio commented 1 year ago

Will do, thanks! 😇

mruwnik commented 1 year ago

You can have a look at this PR: https://github.com/StampyAI/stampy-nlp/pull/18 The CORS allowed origin will be '*' by default, but you can override that with an ALLOWED_ORIGIN env variable to set a whitelist of allowed origins.

There's a deployed dev version at https://stampy-nlp-dev-t6p37v2uia-uw.a.run.app that you can use to check it out - it seems to work for now:

image

joepio commented 1 year ago

I'm currently getting 500s from the new URL (both from a browser fetch and from opening the URL in the browser), maybe it's down?

mruwnik commented 1 year ago

yup, my fault. Try now