kagisearch / smallweb

Kagi Small Web
https://kagi.com/smallweb
MIT License

Add OPML Generation #208

Closed sblaplace closed 6 months ago

sblaplace commented 6 months ago

I've added code to generate an OPML file. It includes a function, update_opml(get_urls), in sw.py that generates the OPML file, and it adds the /opml route. I also updated the dependencies in requirements.txt, added a link to the HTML template, and added a COPY line to the Dockerfile for smallweb.txt.

The get_urls parameter controls whether it attempts to fetch feed metadata. I initially looked for a way to reuse the feed metadata that Kagi already gets, but since the full feed generation is done through the Kagi API, I didn't see a way to hook into those requests and reuse their metadata. If set to False, it simply puts the links into the OPML file on their own.

I hope this is sufficient - Thank you for considering including this feature.
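
For readers unfamiliar with the format, the links-only case described above (get_urls set to False) can be sketched with nothing but the standard library. This is not the PR's code, which uses pyopml; the function name build_opml, its parameters, and the example URLs are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def build_opml(feed_urls, titles=None, doc_title="Kagi Small Web"):
    """Return an OPML 2.0 string with one <outline> per feed URL.

    Hypothetical helper, not the update_opml function from the PR.
    `titles` optionally maps a feed URL to a display title; when a URL has
    no entry (the links-only case), the URL itself is used as the label.
    """
    titles = titles or {}
    root = ET.Element("opml", version="2.0")
    head = ET.SubElement(root, "head")
    ET.SubElement(head, "title").text = doc_title
    body = ET.SubElement(root, "body")
    for url in feed_urls:
        label = titles.get(url, url)
        # OPML subscription lists use type="rss" with the feed in xmlUrl.
        ET.SubElement(body, "outline", type="rss", text=label, xmlUrl=url)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    print(build_opml(["https://example.com/feed.xml"]))
```

Skipping the metadata lookup means no network requests are made per feed, so the cost is just one pass over the list of URLs.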

vprelovac commented 6 months ago

Looks good, thanks for including the update to the README as well.

vprelovac commented 6 months ago

@sblaplace weirdly, when building this locally I get issues with:

from opml import OpmlDocument

Any ideas?

pip install pyopml
Requirement already satisfied: pyopml in /Users/prelovac/.pyenv/versions/3.11.4/lib/python3.11/site-packages (1.0.0)
Requirement already satisfied: lxml>=4.6 in /Users/prelovac/.pyenv/versions/3.11.4/lib/python3.11/site-packages (from pyopml) (4.9.2)

python
Python 3.11.4 (main, Oct 16 2023, 12:27:55) [Clang 15.0.0 (clang-1500.0.40.1)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> from opml import OpmlDocument
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: cannot import name 'OpmlDocument' from 'opml' (unknown location)

>>> import opml
>>> 

sblaplace commented 6 months ago

Odd, this is what I get:

[user@code ~]$ python3
Python 3.12.1 (main, Dec 18 2023, 00:00:00) [GCC 13.2.1 20231205 (Red Hat 13.2.1-6)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import opml
>>> from opml import OpmlDocument
>>> 

The only immediate difference I see is that I'm on Fedora and Python 3.12, but I'm not sure how or why that would come into play here, given the same version of pyopml.
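
One way to narrow this down is to ask Python where it is actually loading opml from. The "(unknown location)" in the ImportError means the imported module has no file path, which commonly happens when a directory named opml without an __init__.py is picked up as a namespace package ahead of the installed one. Whether that is the cause here is an assumption; the helper name describe_module is hypothetical.

```python
import importlib.util


def describe_module(name):
    """Report where the import system would load a top-level module from.

    Hypothetical diagnostic helper. A namespace package (a bare directory
    without __init__.py) has no origin file but does have search locations.
    """
    spec = importlib.util.find_spec(name)
    if spec is None:
        return {"found": False, "origin": None, "locations": [], "namespace": False}
    locations = list(spec.submodule_search_locations or [])
    return {
        "found": True,
        "origin": spec.origin,
        "locations": locations,
        "namespace": spec.origin is None and bool(locations),
    }


if __name__ == "__main__":
    # Run this from the same directory and interpreter that show the error.
    print(describe_module("opml"))
```

If "namespace" comes back True, the listed locations show which directory is shadowing the pyopml install.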

vprelovac commented 6 months ago

Yeah, very odd. I see we are getting the build error on deployment as well. Waiting to hear from our devops what the cause is; I thought it might be the same.

vprelovac commented 6 months ago

@sblaplace how long does it take for you to create the OPML document during boot? Have you tried this locally, and does the /opml endpoint work?

vprelovac commented 6 months ago

Yeah, it appears this is going to be a multi-hour operation, which we cannot do at boot. Ideas?

sblaplace commented 6 months ago

I see that get_urls has since been set to default to False. Did that fix the speed issues? After looking over my code again, I believe that lines 155 and 157 should also not be assigning to opml_document again, since opml_document.add_rss returns an OpmlOutline rather than an OpmlDocument. I have a commit with this change on my own fork, if desired.
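
The reassignment problem can be illustrated without pyopml installed. The two classes below are stand-ins written for this sketch, not the library's real classes; they copy only the behavior described above, namely that add_rss returns the new outline rather than the document. The function name build_document is also an assumption.

```python
class FakeOutline:
    """Stand-in for the outline object returned by add_rss (not pyopml's class)."""

    def __init__(self, text, xml_url):
        self.text = text
        self.xml_url = xml_url


class FakeDocument:
    """Stand-in for the document object (not pyopml's class)."""

    def __init__(self):
        self.outlines = []

    def add_rss(self, text, xml_url):
        outline = FakeOutline(text, xml_url)
        self.outlines.append(outline)
        return outline  # returns the outline, not the document


def build_document(feed_urls):
    document = FakeDocument()
    for url in feed_urls:
        # Problem pattern: `document = document.add_rss(url, url)` would
        # rebind the name to an outline and lose the document itself.
        document.add_rss(url, url)  # call for its side effect only
    return document
```

Discarding the return value keeps the name bound to the document, so whatever is serialized later is the full feed list.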

vprelovac commented 6 months ago

Yes, that fixed the speed issue, but OPML is not working due to the second reason you outlined. Please submit a PR.