augustwester / searchthearxiv

The code powering searchthearxiv.com, a simple semantic search engine for more than 300,000 ML papers on arXiv.
https://searchthearxiv.com
GNU General Public License v3.0
112 stars 11 forks

Add new category #5

Open monperrus opened 4 days ago

monperrus commented 4 days ago

Hi @augustwester, thanks a lot for the great website. It would be great to add other categories to searchthearxiv, such as cs.SE and cs.PL. What do you think? Thanks!

augustwester commented 3 days ago

Hi @monperrus,

Thanks for using searchthearxiv!

Restricting the site to ML-related papers was mostly done to reduce the costs of 1) embedding all relevant abstracts and 2) storing the embeddings in the Pinecone vector database. While I could easily add the extra categories, I fear it might be a slippery slope that ends with me paying a larger monthly bill than I'm comfortable with.

monperrus commented 3 days ago

Hi @augustwester,

cs.SE and cs.PL are tiny compared to ML, so the cost increase would be minimal.

What if we find a sponsor for the operational costs, such as OpenAI, Pinecone, or arXiv?

What do you think?