Closed matt-boris closed 1 year ago
I'm going to just run the tool multiple times on each input folder and have multiple outputs. I'd love to be able to run all at once and use the divide into dates feature!
Oh shit :fearful: never thought this would happen
> run the tool multiple times on each input folder
Best solution would be to:
Don't run it on each unzipped folder as-is, because the zips are split randomly, and the contents of one "year folder" may be spread across several of those zips
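Since one "year folder" can be split across several zips, one way around it is to merge the extracted parts into a single folder first and run the tool once on the result. Here is a minimal Python sketch of such a merge. The function name, the folder names, and the skip-on-conflict rule are all assumptions made for illustration, not part of gpth:

```python
import shutil
from pathlib import Path


def merge_takeout_parts(parts: list[Path], merged: Path) -> int:
    """Copy every file from each extracted part into `merged`.

    Relative paths are kept, so files of the same "year folder" coming
    from different parts end up next to each other.
    Returns the number of files copied.
    """
    copied = 0
    for part in parts:
        for src in sorted(part.rglob("*")):
            if not src.is_file():
                continue
            dst = merged / src.relative_to(part)
            dst.parent.mkdir(parents=True, exist_ok=True)
            if dst.exists():
                # Assumption: the same relative path in two parts is a
                # duplicate, so keep the first copy instead of overwriting.
                continue
            shutil.copy2(src, dst)
            copied += 1
    return copied
```

For example, `merge_takeout_parts([Path("takeout-001"), Path("takeout-002")], Path("merged"))` would produce one `merged` folder that can then be passed as the single input. This needs enough free disk space for a full copy of the export.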
Honestly, I don't have any good idea how this happens... the heaviest that a Media object could weigh is, idk, 128 bytes? 128 bytes * 31442 ~= 4 MB
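Spelling that estimate out, as a quick sanity check. The 128 bytes per object is the rough guess from above, not a measured value:

```python
BYTES_PER_MEDIA = 128  # rough guess per object, not measured
FILE_COUNT = 31442     # number of files in this run

total_bytes = BYTES_PER_MEDIA * FILE_COUNT
total_mib = total_bytes / (1024 * 1024)

print(f"{total_bytes} bytes = {total_mib:.2f} MiB")
```

So even if the guess is off by a factor of ten, the per-file bookkeeping alone would stay in the tens of megabytes, which is nowhere near exhausting 8GB of RAM.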
Maybe updated Dart will help when i do new release...
If there is any Dart expert that can identify why, pls help
- Are you using interactive mode (I suppose you're not)?
Yeah, I'm not.
- Can you send a screenshot of memory usage, showing how much gpth takes over time?
I'll get around to this sometime either this weekend or the upcoming week. I'm not sure how in-depth I can get for you on a Synology NAS, but I'll do some digging around.
> Maybe updated Dart will help when i do new release...
Probably wouldn't hurt! 🤞🏻
@TheLastGimbus So this is over about 30 min of running the tool. Really interesting to see the massive jumps in memory utilization. gpth is using about 6GB of memory at the time when it gets killed.
I wonder if https://docs.flutter.dev/development/tools/devtools/memory can be used to determine if there's a memory leak of some sort.
Bumping the Dart version unfortunately didn't help :( I'll keep trying stuff!
Trying to use the DevTools to see what's eating up all the memory.
Running it with: dart --enable-vm-service ./bin/gpth.dart --input <input> --output <output> --copy --divide-to-dates
I could send you the memory dump once this is done if you'd like?
yesss that would absolutely help
i started making nightly builds with some options disabled, but looks like you've got Dart figured out:
my theory is that something may be wrong in reading jsons/exifs?
could you please try disabling (commenting out) json/exif/both extractors here (guess can be left enabled):
Save these into CSVs https://paste.mozilla.org/2qBN5kiA, https://paste.mozilla.org/rtDtAees
I've no idea where this export button within DevTools downloads anything to on my local machine. Will ping you if I find it 😬
@TheLastGimbus yeah it's definitely those date extractors! This run, the tool flew through the input (all 30k+ files) and is already copying them to their destination folder. This is the furthest I've gotten now 🎉
Would still be nice to get dates on these files though :)
I'll bet you I run into my issue when this line is hit on a large file (multiple GB video files) and my system just runs out of memory. https://github.com/TheLastGimbus/GooglePhotosTakeoutHelper/blob/38ea053b60253f1ff3c8eb3d89c9fe7b8aeee6fb/lib/date_extractors/exif_extractor.dart#L10
The CPU on the NAS may not be fast enough to garbage collect it all in time before more bytes are read in.
If this exif extractor was a little smarter around its memory usage, that'd be great, since I'd still be able to benefit from it with all other files that don't affect the memory nearly as much.
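To illustrate what "smarter around its memory usage" could look like: instead of loading a whole multi-GB file, read only a bounded prefix and hand that to the metadata parser. This is a Python sketch of the idea only, not the project's Dart code and not necessarily what any fix does. The 64 KiB cap and the function name are assumptions, and whether a given EXIF parser accepts a truncated buffer would need checking:

```python
from pathlib import Path

# Assumed cap. EXIF data in a JPEG typically sits near the start of the
# file, and a single JPEG segment cannot exceed 65535 bytes.
HEADER_LIMIT = 64 * 1024


def read_header(path: Path, limit: int = HEADER_LIMIT) -> bytes:
    """Return at most `limit` bytes from the start of `path`.

    Memory use is bounded by `limit`, regardless of the file size.
    """
    with path.open("rb") as f:
        return f.read(limit)
```

With this, a 4GB video costs the same 64 KiB as a small photo. The trade-off is that formats which keep their metadata elsewhere in the file would not yield a date this way, so those files would have to fall back to the other extractors.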
oh shit...
got it, will fix in a second...
done! fixed with e0d9ee3e71def69d74eba7cf5ec204672924726d / https://github.com/TheLastGimbus/GooglePhotosTakeoutHelper/releases/tag/v3.3.3
While guessing the dates from the files, my system (8GB RAM) runs out of memory :(
I get to about here before the system kills the script due to memory.
Any suggestions? Could files be written to the output folder on an ongoing basis instead of keeping all the info in memory?
Thanks again!