interplanetarychris closed this issue 5 years ago
I think that you are on to the problem. I changed my search path while trying to narrow the problem down. Originally I was searching /Volumes/Local01/Pictures/2008 as a test, which worked. Then I moved on to a larger directory, /Volumes/Local01/Pictures/2015. That is when I got this error, and it will not go away. Notice that the path in the error is from the first search path.
Deleting ~/Library/Application Support/dupeGuru/cached_pictures.shelve.db fixed this.
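For anyone who would rather script that workaround, here is a minimal sketch. It assumes the cache is a shelve file whose dbm backend may have appended its own suffix (such as `.db`), so it removes every file that starts with the given base name. `remove_cache` is a hypothetical helper, not part of dupeGuru, and dupeGuru should be closed before running it.

```python
import glob
from pathlib import Path


def remove_cache(cache_base):
    """Delete a shelve cache file and any suffixed siblings ('.db', '.dat', ...).

    cache_base is the cache path, with or without the backend suffix.
    Returns the list of files that were removed.
    Hypothetical helper for illustration; not a dupeGuru API.
    """
    cache_base = Path(cache_base)
    removed = []
    # Escape the name so characters like '[' are not treated as glob syntax.
    pattern = glob.escape(cache_base.name) + "*"
    for candidate in sorted(cache_base.parent.glob(pattern)):
        if candidate.is_file():
            candidate.unlink()
            removed.append(candidate)
    return removed
```

Pass it the cache path mentioned above. dupeGuru should then rebuild the cache on the next picture scan, so the first scan afterwards will be slower.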
Application Identifier: com.hardcoded-software.dupeguru
Application Version: 4.0.3
Mac OS X Version: Version 10.12.4 (Build 16E195)
Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 111, in __getitem__
KeyError: 'path:/Volumes/Local01/Pictures/2008/07/20080706_124659-5.jpg'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/cocoa/inter.py", line 259, in pulse
File "build/dupeGuru.app/Contents/Resources/py/hscommon/gui/progress_window.py", line 101, in pulse
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 323, in _job_error
File "build/dupeGuru.app/Contents/Resources/py/hscommon/jobprogress/performer.py", line 43, in _async_run
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 780, in do
File "build/dupeGuru.app/Contents/Resources/py/core/scanner.py", line 137, in get_dupe_groups
File "build/dupeGuru.app/Contents/Resources/py/core/pe/scanner.py", line 31, in _getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 167, in getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 65, in prepare_pictures
File "build/dupeGuru.app/Contents/Resources/py/core/pe/cache_shelve.py", line 121, in purge_outdated
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 113, in __getitem__
KeyError: b'path:/Volumes/Local01/Pictures/2008/07/20080706_124659-5.jpg'
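A note on reading the report above: the two KeyErrors, one with a str key and one with a bytes key, describe a single failed lookup. Python's standard `shelve.Shelf` first checks its in-memory cache with the str key, then asks the underlying database for the encoded bytes key. The sketch below reproduces that pattern with a `Shelf` backed by a plain dict, which stands in for the on-disk file. `lookup_error_keys` is a hypothetical helper written for this illustration.

```python
import shelve


def lookup_error_keys(shelf, key):
    """Return (first_key, second_key) from the chained KeyErrors, or None if found."""
    try:
        shelf[key]
    except KeyError as exc:
        # exc is the second error, raised by the backing store for the bytes key.
        # exc.__context__ is the first error, raised by the in-memory cache.
        inner = exc.__context__
        return inner.args[0], exc.args[0]
    return None


# A Shelf over an empty dict: every lookup misses, as it did for the stale path.
shelf = shelve.Shelf({})
first, second = lookup_error_keys(shelf, "path:/missing.jpg")
print(repr(first), repr(second))
```

So the str/bytes pair is not a sign of an encoding problem. It only shows that the key was absent from the store at the moment `purge_outdated` asked for it.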
I tried again today to reproduce the error, to no avail. I think the issue has to do with hash collisions in the cache and that reproducing it requires a large number of photos. I don't have a large photo collection, so I can't reproduce it.
As I wrote in #402, I really don't like the idea of fixing a problem blindly, but then again, because many users have a large photo collection (why would you use dupeGuru otherwise?), I'm going to do it. @silentnyte @interplanetarychris would you be willing to confirm or refute the fix if I created a test build?
Certainly - happy to help debug and make the tool more useful. My image counts are often between 5,000 and 40,000.
-Chris
I don't mind helping out.
https://download.hardcoded.net/dupeguru_osx_4_0_3_shelvetest.dmg
This test version is the same as v4.0.3, but with the addition of the commit referenced above. @interplanetarychris @silentnyte Could you confirm that it works properly in situations where the vanilla v4.0.3 failed?
I've since been able to try the shelve test version in scenarios similar to the prior failures. I have added and removed folders from the scan list to exercise additions to and removals from the cache. It has not crashed so far, with image repositories ranging from a few thousand to about 35K files. Thanks for the fix!
I got this error on 4.0.3, upgraded to 4.0.4, got the same error, deleted the cache manually, reran, and now it seems to be working. Does that mean the bug is gone from 4.0.4 or not?
Application Identifier: com.hardcoded-software.dupeguru
Application Version: 4.0.4
Mac OS X Version: Version 10.15.4 (Build 19E266)
Traceback (most recent call last):
File "build/py/shelve.py", line 111, in __getitem__
KeyError: 'path:/Volumes/bertha/!Sorteres/39.jpg'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "build/py/cocoa/inter.py", line 259, in pulse
File "build/py/hscommon/gui/progress_window.py", line 101, in pulse
File "build/py/core/app.py", line 323, in _job_error
File "build/py/hscommon/jobprogress/performer.py", line 43, in _async_run
File "build/py/core/app.py", line 780, in do
File "build/py/core/scanner.py", line 137, in get_dupe_groups
File "build/py/core/pe/scanner.py", line 31, in _getmatches
File "build/py/core/pe/matchblock.py", line 167, in getmatches
File "build/py/core/pe/matchblock.py", line 65, in prepare_pictures
File "build/py/core/pe/cache_shelve.py", line 121, in purge_outdated
File "build/py/shelve.py", line 113, in __getitem__
KeyError: b'path:/Volumes/bertha/!Sorteres/39.jpg'
Parts of the path have been cut from the message. This happens when scanning a collection of 200,000 images.
Weird. I tried re-running dupeGuru on another folder and got the same crash, again referencing the old folder (which was not part of the scan)...
I'm doing a very large Picture/Content dupe check on repositories that include Lightroom and Aperture directories on local and AFP filesystems.
The file at the end was not included in the current search, but apparently there was a problem in purging the cache.
Application Identifier: com.hardcoded-software.dupeguru
Application Version: 4.0.3
Mac OS X Version: Version 10.12.6 (Build 16G29)
Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 111, in __getitem__
KeyError: 'path:/Volumes/storage/Aperture Libraries/Genealogy Library (active) 3.aplibrary/Masters/2010/01/06/20100106-121355/00774_n_9aek3kmar0118.jpg'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "build/dupeGuru.app/Contents/Resources/py/cocoa/inter.py", line 259, in pulse
File "build/dupeGuru.app/Contents/Resources/py/hscommon/gui/progress_window.py", line 101, in pulse
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 323, in _job_error
File "build/dupeGuru.app/Contents/Resources/py/hscommon/jobprogress/performer.py", line 43, in _async_run
File "build/dupeGuru.app/Contents/Resources/py/core/app.py", line 780, in do
File "build/dupeGuru.app/Contents/Resources/py/core/scanner.py", line 137, in get_dupe_groups
File "build/dupeGuru.app/Contents/Resources/py/core/pe/scanner.py", line 31, in _getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 167, in getmatches
File "build/dupeGuru.app/Contents/Resources/py/core/pe/matchblock.py", line 65, in prepare_pictures
File "build/dupeGuru.app/Contents/Resources/py/core/pe/cache_shelve.py", line 121, in purge_outdated
File "build/dupeGuru.app/Contents/Resources/py/shelve.py", line 113, in __getitem__
KeyError: b'path:/Volumes/storage/Aperture Libraries/Genealogy Library (active) 3.aplibrary/Masters/2010/01/06/20100106-121355/00774_n_9aek3kmar0118.jpg'
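Every report in this thread fails at the same point: `purge_outdated` looks up a key that the cache lists but cannot return. As an illustration only, here is a sketch of a purge loop that tolerates such keys instead of aborting the scan. The cache layout used here (keys of the form `path:<file>`, values holding an `mtime` field) is an assumption made for the example. It is not dupeGuru's actual schema, and it is not the fix in the test build.

```python
import os


def purge_outdated(cache, prefix="path:"):
    """Remove entries whose picture is gone or newer than the cached record.

    `cache` is any mutable mapping (a dict or a shelve.Shelf, for instance).
    Assumed layout for this sketch: key = prefix + file path,
    value = dict with an 'mtime' field. Returns the keys judged stale.
    """
    stale = []
    for key in list(cache.keys()):      # snapshot so deleting later is safe
        if not key.startswith(prefix):
            continue
        try:
            record = cache[key]
        except KeyError:
            # Listed by the backend but not retrievable, as in the tracebacks.
            stale.append(key)
            continue
        path = key[len(prefix):]
        try:
            if os.path.getmtime(path) > record["mtime"]:
                stale.append(key)       # file changed since it was cached
        except OSError:
            stale.append(key)           # file is missing or unreachable
    for key in stale:
        try:
            del cache[key]
        except KeyError:
            pass                        # nothing to delete for unreadable keys
    return stale
```

The design choice is to treat an unreadable key as stale rather than fatal, so a single inconsistent entry from an earlier scan cannot block a scan of an unrelated folder.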