py-pdf / pypdf

A pure-python PDF library capable of splitting, merging, cropping, and transforming the pages of PDF files
https://pypdf.readthedocs.io/en/latest/
Other
8.07k stars 1.39k forks source link

API inconsistency in `RectangleObject`? #1842

Closed lpozo closed 1 year ago

lpozo commented 1 year ago

Explanation

I guess there is a type mismatch in the API of RectangleObject. For example, properties like .lower_left (and similar) return a tuple. However, their setter methods take a list object instead of a tuple. Would it be more consistent to take a tuple too?

Code Example

How would your feature be used? (Remove this if it is not applicable.)

>>> # Current usage
>>> first_page.mediabox.upper_left = [0, 480]

>>> # More consistent?
>>> first_page.mediabox.upper_left = (0, 480)
pubpub-zz commented 1 year ago

After checking the system is more tolerant than indicated by the annotation : any Iterator of 2 floats is accepted. As you have detected the point, you can propose a PR replacing the List with an iterator to be match the acceptable types

lpozo commented 1 year ago

Since these are properties that return two-tiem tuples, accepting a two-item iterable in the setter is kind of misleading and violates the least astonishment principle. So, probably the way to go would be to use Tuple[float, float].

MartinThoma commented 1 year ago

Thanks for the reminder and sorry for the delay :see_no_evil:

I can confirm that this is inconsistent:

    @property
    def lower_left(self) -> Tuple[float, float]:
        """
        Property to read and modify the lower left coordinate of this box
        in (x,y) form.
        """
        return self.left, self.bottom

    @lower_left.setter
    def lower_left(self, value: List[Any]) -> None:
        self[0], self[1] = (self._ensure_is_number(x) for x in value)

I'm uncertain if we should enforce tuples for the input:

Changing it to an iterable would be ok for me, but that would mean that people would not get feedback from mypy if the iterable has just one element. So enforcing a tuple with two elements is better.

TL;DR: Please make a PR changing those getters from List[Any] to Tuple[float, float] :-) @lpozo Do you want to do it?

postmeback commented 1 year ago

Hi @MartinThoma ,

I have started in Python and would like to contribute here. Can you tell me, in which file do I need to do the change?

MartinThoma commented 1 year ago

You need to modify pypdf/generic/_rectangle.py :-)

MartinThoma commented 1 year ago

@lpozo If you want, I can add you as a contributor: https://pypdf.readthedocs.io/en/latest/meta/CONTRIBUTORS.html - I value all kinds of contributions, also and especially people pointing out issues :-)

As @postmeback seems to be eager to contribute, I would let him do it. It's nice if people get a learning experience (and maybe contribute more over time :-) )

lpozo commented 1 year ago

@MartinThoma Hey! Yes, I'd be glad to be part of the contributors' team. I've been using pypdf for a while now and the code has improved a lot since PyPDF2 and the like. The API is easier to work with and more Pythonic. So, you've done a great job here.

I can start contributing to the project. However, I have to say that I don't really have much time right now. So, my contributions might be small and quick fixes in most cases. I can also point them out as issues since this will take less time for me.

I came across this issue while I was updating this article over at Real Python: https://realpython.com/creating-modifying-pdf/

The article covers many aspects of pypdf. It's a great starting point for people learning pypdf. I'm not the original author by I just updated all the code from PyPDF2 to pypdf.

MartinThoma commented 1 year ago

The issue has been addressed. The updated version of pypdf will be released latest on Sunday (2nd of July).

my contributions might be small

I value all contributions, including just pointing out issues :heart: We are all here for the helping others + the joy creating good software brings :-)

just updated all the code [in the Real Python article] from PyPDF2 to pypdf

That is amazing! I like realpython.com and I stumble from time to time over good articles there. I was writing them that the article was seriously outdated (at least I think so - I wrote a couple of article authors). I'm very happy that you did it :heart: