aiidateam / disk-objectstore

An implementation of an efficient "object store" (actually, a key-value store) writing files on disk and not requiring a running server
https://disk-objectstore.readthedocs.io
MIT License
15 stars 8 forks source link

🔧 MAINTAIN: Add typing #113

Closed chrisjsewell closed 3 years ago

chrisjsewell commented 3 years ago

~currently based on top of #112~

Come across this issue for get_objects/get_object_stream_and_meta: https://github.com/PyCQA/pylint/issues/3233

Types initially generated with https://github.com/Instagram/MonkeyType

codecov[bot] commented 3 years ago

Codecov Report

Merging #113 (f4e7169) into develop (2d2d32c) will decrease coverage by 0.12%. The diff coverage is 98.04%.

Impacted file tree graph

@@             Coverage Diff             @@
##           develop     #113      +/-   ##
===========================================
- Coverage    99.80%   99.68%   -0.13%     
===========================================
  Files            7        7              
  Lines         1553     1585      +32     
===========================================
+ Hits          1550     1580      +30     
- Misses           3        5       +2     
Impacted Files Coverage Δ
disk_objectstore/examples/example_objectstore.py 100.00% <ø> (ø)
disk_objectstore/container.py 99.64% <96.77%> (-0.36%) :arrow_down:
disk_objectstore/utils.py 99.60% <99.09%> (+0.21%) :arrow_up:
disk_objectstore/models.py 100.00% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 2d2d32c...f4e7169. Read the comment docs.

chrisjsewell commented 3 years ago

@ramirezfranciscof thanks for the review, but the other changes were specifically added to fix typing issues i.e. without them the type checking fails. So it doesn’t make sense to split them

ramirezfranciscof commented 3 years ago

Ah, really? That is very surprising. Can you tell me what kind of problem you were getting for each of those three changes? I'll be very interested in understanding how the typing affects that, since I'm trying to incorporate it myself when I'm coding and may run into these too 😅 .

On the other hand, perhaps it was wrong of me to limit my comment to "the other changes enumerated in the description", as there are also a couple of modifications that were not mentioned there but I was also referring to. For example, there is a modification in the __init__ of CallbackStreamWrapper where _update_every and _since_last_update get initialized outside of the if self._callback: condition. Was this also modified because of the type checking/hinting??

chrisjsewell commented 3 years ago

Was this also modified because of the type checking/hinting??

any changes were because of type checking

chrisjsewell commented 3 years ago

Remove @property from seekable method (as per BinaryIO), Disallow mode in LazyOpen (should always be readable binary), add enter/exit methods to PackedObjectReader, CallbackStreamWrapper, ZlibLikeBaseStreamDecompresser, so they won't fail in add_streamed_objects_to_pack with open_streams=True

All these changes are because PackedObjectReader, CallbackStreamWrapper, ZlibLikeBaseStreamDecompresser are intended to mimic a typing.BinaryIO object, i.e. what you get from open("path", mode="rb"), but they either did this incorrectly, in the case of seekable, or did not include enough of the BinaryIO interface, in the case of __enter__/__exit__.

ramirezfranciscof commented 3 years ago

Ah, ok, I see, so these are to fix the interface of typing.BinaryIO so it will recognize it when type checking. Although I still don't understand how this relates to the change inside the __init__ of CallbackStreamWrapper 🤔 ...type checking shouldn't go in the implementation inside the methods, no? Is having a _update_every and a _since_last_update members part of the "interface" of the typing.BinaryIO?

However, if this were modifications made "in preparation" for the type checking, this does not prevent us to have them in a first commit and then have a second one for just the type checking. I do agree now that it makes sense to have these on the same PR, but I still think the two commits are valuable to keep separated everything that is behavior-modifying, which would make it much easier to review (meaning not only for reviewing like what I'm doing now, since it is now a bit too late for that, haha, but I mean for revisiting in general if at some point in the future there is any need to debug or understand anything related to these changes).

chrisjsewell commented 3 years ago

the change inside the init of CallbackStreamWrapper

if instance attributes are set dynamically, I.e in sonde if statements, it means they may not be set and mypy cannot statically determine that. Basically you should generally always set instance attributes with a default value

chrisjsewell commented 3 years ago

if this were modifications made "in preparation"

if have to disagree with this I’m afraid; they made because of typing check failures, not in preparation. The commit would not be reviewable in isolation, since it would claim to fix typing issues that were not yet tested

ramirezfranciscof commented 3 years ago

I mean, yeah, I agree that this was the actual order in which things happened / reason why the changes were made. What I am saying is that conceptually the changes can make sense separated ("adapting the interface to comply with typing.BinaryIO" + "adding typechecking") and we would be gaining a lot in code/record clarity and development modularity. The logical connection of the changes being made for the typing would still be kept since they are in the same PR, so there is not really anything that is being lost in this reorganization. Am I missing anything here?

chrisjsewell commented 3 years ago

we would be gaining a lot in code/record clarity and development modularity.

Ah dude, I completely disagree; it won't provide any measurable benefit, its just overcomplicating things

But its really not worth falling out over lol, so my comprise is that I've made this commit and rebased the PR on it: https://github.com/aiidateam/disk-objectstore/commit/2d2d32c14fb4f5a0068fbbe49a16daf3285aeda8

chrisjsewell commented 3 years ago

Thanks 🙏