Closed lifesminder closed 1 month ago
This is a good idea considering how large the archives are getting, but there's a couple changes that would need to be made:
file_digest
which isn't available to us with the version we use, but they have chosen a size of 218 (262144) which I think we should use instead and will make that take less time to process: https://github.com/python/cpython/blob/1f4a49ea53516e7ff177beedc09a1e4439b3be1f/Lib/hashlib.py#L195This is a good idea considering how large the archives are getting, but there's a couple changes that would need to be made:
- There is trailing whitespace on the three empty lines you've added. That is why one of the tests has failed and that would need to be removed.
- The buffer size should be increased. The common knowledge of using 4k/8k chunks is somewhat misapplied in situations where the file is always going to be large. Python added the helper function
file_digest
which isn't available to us with the version we use, but they have chosen a size of 218 (262144) which I think we should use instead and will make that take less time to process: https://github.com/python/cpython/blob/1f4a49ea53516e7ff177beedc09a1e4439b3be1f/Lib/hashlib.py#L195
cool, for these purposes I have added an extra chunk_bytes
argument, in order to make it scalable and set required size in specific scenarios.
about whitespace I can't say much, as I don't have an experience on python 2.x versions programming :)
@Ahrotahn I can't fix the code_check testing because whatever I do it doesn't change file but send an empty commit instead
what should I do?
There's a different problem this time: There needs to be two line breaks before unpack_downloads
. @rany2's suggestion should fix that.
The problem you had previously wasn't with the empty lines, it was that those lines weren't empty. They contained space characters.
Since your commits are signed by Github it seems you're using their editor, and unfortunately that doesn't have the option to show whitespace characters. If you happen to use a userscript extension there is a script that can be used to add that functionality: https://github.com/Mottie/GitHub-userscripts/wiki/GitHub-code-show-whitespace
Minor thing: you keep saying the default is 8 megabytes in the code and PR, but it's actually 8192 bytes (8 KiB). Anyway as stated in https://github.com/ungoogled-software/ungoogled-chromium/pull/3045#issuecomment-2385931844, the default is too low and should be increased.
Minor thing: you keep saying the default is 8 megabytes in the code and PR, but it's actually 8192 bytes (8 KiB). Anyway as stated in #3045 (comment), the default is too low and should be increased.
oh okay, which size should it be?
hi, is code reviewed yet?
You marked my suggestions as resolved without making any changes
You marked my suggestions as resolved without making any changes
so how to put them into commit? It's my first time to make a pull request, sorry for too much questions :3
There should be a button to add the suggestion as a commit. There has also been a change we recently merged that causes a conflict with these changes, so you would also need to use the resolve conflicts button to handle that.
Alternatively I could take some time later to rebase the changes and add those suggestions for you so long as your branch allows us to push to it.
well, made @rany2 's improvements and re-added chunk_bytes along with components. I guess it is the latest change of that function, so decided to sync
I think we'll have to squash commits before merging to tidy it up a bit.
@PF4Public The 'squash and merge' option should handle that for us
@PF4Public The 'squash and merge' option should handle that for us
yep, I meant just that
Thanks @lifesminder!
no problem :Ճ
Previous iteration checked whole file and caused memory issues, making programs non-responding. This one improves it, by reading chunks by bytes (default is 8 megabytes).
Tested in Fedora 40.