gcerretani / antenati

Tools to download data from Portale Antenati
MIT License

Feature Request: Pages #10

Closed jbellanca closed 2 years ago

jbellanca commented 2 years ago

I'd like to request a feature where you could specify the start and end page numbers, and have it just download those images. Would be very helpful to just grab the last 8 pages of an index, for example.

gcerretani commented 2 years ago

A very interesting feature, thanks @jbellanca. I'll try to implement it!

alexreg commented 2 years ago

I second this. And ideally, a shorthand to fetch just a single page by its number.

Thank you for publishing this excellent tool, in any case.

alexreg commented 2 years ago

Hi @gcerretani. I got inspired and ended up implementing this feature, along with a couple of other small things.

https://github.com/alexreg/antenati/tree/local

Feel free to merge in some/all of these changes, if you like.

You'll have to forgive me for switching CLI parsing to the Cloup library... I had already been looking for an excuse to play around with it, and I also didn't see a way to make optional (positional) arguments using argparse. Regardless, it should be easy enough to ignore that change.

gcerretani commented 2 years ago

Done on 7c40b4bc8185ee9fccce33af372fcb4b24e91630 (sorry for the bad issue reference in the commit message, 1 instead of 10!)

gcerretani commented 2 years ago

You can select the range using the new command line options -f (first page) and -l (last page, excluded). For example:

antenati.py URL downloads everything.

antenati.py URL -f 0 downloads everything.

antenati.py URL -f 10 downloads everything starting from page 10.

antenati.py URL -l 11 downloads from the first page up to page 11, excluded (11 pages, since the first index is 0).

antenati.py URL -f 10 -l 11 downloads from page 10 to page 11, excluded (1 page).

Negative indexes can also be used:

antenati.py URL -f -4 downloads only the last 4 pages.

antenati.py URL -l -3 downloads everything except the last 3 pages.
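
The behaviour described above (zero-based first index, excluded last index, negative values counting from the end) matches Python's slice semantics. The following is only a minimal sketch of how such options could map onto a slice, not the code from the commit referenced earlier: the names build_parser, select_pages, first and last are hypothetical, and only the -f and -l flags come from this thread.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the -f / -l flags are taken from the thread,
    # the dest names and defaults are assumptions made for this sketch.
    parser = argparse.ArgumentParser(prog="antenati.py")
    parser.add_argument("url")
    parser.add_argument("-f", dest="first", type=int, default=0)
    parser.add_argument("-l", dest="last", type=int, default=None)
    return parser


def select_pages(pages, first=0, last=None):
    # A plain slice: 'first' is included, 'last' is excluded,
    # and negative values count from the end of the list.
    return pages[first:last]


if __name__ == "__main__":
    args = build_parser().parse_args()
    # Stand-in for the real list of page images; 20 pages are assumed here.
    pages = list(range(20))
    print(select_pages(pages, args.first, args.last))
```

With this mapping, -f 10 -l 11 selects the single element at index 10, and -f -4 selects the last four elements, in line with the examples given above.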