openzim / python-libzim

Libzim binding for Python: read/write ZIM files in Python
https://pypi.org/project/libzim/
GNU General Public License v3.0

Support iterators #94

Open rgaudin opened 3 years ago

rgaudin commented 3 years ago

Archive needs support for a few iterators:

rgaudin commented 3 years ago

Awaiting this before removing Archive.get_entry_by_id(), Entry.index and Item.index, as they are currently the only way to loop over the ZIM content and we use them in tests.

mgautierfr commented 3 years ago

It would be nice to have a "default" iterator (iterEfficient?) on Archive itself.

This way we could iterate over the archive with:

archive = Archive("foo.zim")
for entry in archive:
    print(entry.title)

rgaudin commented 3 years ago

absolutely 👍

stale[bot] commented 2 years ago

This issue has been automatically marked as stale because it has not had recent activity. It will now be reviewed manually. Thank you for your contributions.

dentropy commented 11 months ago

Turns out zim-tools can list all the articles you need

ZIM Articles List

kelson42 commented 11 months ago

This feature is not directly of interest to us but is a MUST for people using python-libzim to read/inspect ZIM files. How much effort would be needed to implement it?

rgaudin commented 11 months ago

It is of interest to us. We're using the workaround. I can't do it on my own in a reasonable time but I think it would be easy for @mgautierfr

kelson42 commented 11 months ago

@mgautierfr How much work would be needed to get this feature implemented?

mgautierfr commented 11 months ago

Should be pretty straightforward. Few hours, a day at most.

traverseda commented 10 months ago

> It is of interest to us. We're using the workaround. I can't do it on my own in a reasonable time but I think it would be easy for @mgautierfr

What's the workaround?

SrGnis commented 10 months ago

> > It is of interest to us. We're using the workaround. I can't do it on my own in a reasonable time but I think it would be easy for @mgautierfr
>
> What's the workaround?

I suppose rgaudin is referring to this

traverseda commented 10 months ago

Thanks for the very quick response! Here's what that looks like in practice, in case this feature isn't added soon.

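  import os

  from libzim.reader import Archive

  zim = Archive(os.path.expanduser("~/test.zim"))

  # Workaround: walk every entry index and fetch it via the private accessor.
  for i in range(zim.all_entry_count):
      entry = zim._get_entry_by_id(i)
      print(entry)
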
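
Until a real iterator lands, the loop above could be wrapped in a small generator so calling code already reads like the proposed `for entry in archive`. This is only a sketch: `iter_entries` and `FakeArchive` are hypothetical names made up for illustration, and the helper assumes nothing beyond the two members used in the workaround (`all_entry_count` and the private `_get_entry_by_id()`), which may change or disappear in a later release.

```python
from typing import Any, Iterator


def iter_entries(archive: Any) -> Iterator[Any]:
    """Yield every entry of `archive` in index order.

    Hypothetical helper: relies only on `all_entry_count` and the
    private `_get_entry_by_id()` used in the workaround.
    """
    for index in range(archive.all_entry_count):
        yield archive._get_entry_by_id(index)


class FakeArchive:
    """Hypothetical stand-in so the sketch runs without a ZIM file."""

    def __init__(self, titles):
        self._titles = list(titles)

    @property
    def all_entry_count(self):
        return len(self._titles)

    def _get_entry_by_id(self, index):
        return self._titles[index]


if __name__ == "__main__":
    # With python-libzim installed, pass a real Archive instead.
    for entry in iter_entries(FakeArchive(["A", "B", "C"])):
        print(entry)
```

Because the helper is a generator, entries are fetched one at a time rather than collected into a list first, which matters for large archives.
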
kelson42 commented 10 months ago

We will implement it soon