xtracthub / xtract-service

Globus Labs Xtract: Extract metadata from distributed data sets.

Memory issues in prefetcher.py #34

Open tskluzac opened 3 years ago

tskluzac commented 3 years ago

We have multiple mapping dictionaries that grow enormously as we process metadata for over 2 million file objects. We need to figure out how to 'refresh' these dictionaries so that the process stops holding such enormous maps in memory. It looks like we won't be able to just pop the dictionary elements to do this. Will probably need to create some garbage collection thread...
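One possible shape for the 'refresh' idea, as a minimal sketch only: in CPython, as far as I understand, a dict does not shrink its internal table when keys are popped, so a map that once held millions of entries stays large until it is replaced. The sketch below rebuilds the dict from its live entries under a lock, so a background thread could call `refresh()` periodically. `RefreshableMap`, `put`, `refresh`, and the `file-N` keys are hypothetical names for illustration and are not taken from prefetcher.py.

```python
import gc
import threading


class RefreshableMap:
    """Hypothetical wrapper around one mapping dict (not from prefetcher.py)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._data = {}

    def put(self, key, value):
        with self._lock:
            self._data[key] = value

    def pop(self, key, default=None):
        with self._lock:
            return self._data.pop(key, default)

    def __len__(self):
        with self._lock:
            return len(self._data)

    def refresh(self):
        """Copy the live entries into a new dict and drop the old one."""
        with self._lock:
            # Rebinding releases the old, oversized dict once nothing
            # else references it.
            self._data = {k: v for k, v in self._data.items()}
        # Only matters if the stored values take part in reference cycles.
        gc.collect()


# Usage: fill the map, pop most entries, then rebuild it.
file_map = RefreshableMap()
for i in range(100_000):
    file_map.put(f"file-{i}", {"size": i})
for i in range(99_990):
    file_map.pop(f"file-{i}")
file_map.refresh()
```

Whether this actually lowers the process's resident memory depends on the allocator and on what else still references the stored values, so it would need to be measured against the real workload.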