This PR moves the file filtering logic for global file parsing statuses from Python to the database. Our MongoDB holds a six-figure number of parsed-file records, and because this method downloaded the entire collection before parsing each file, it caused huge parsing delays.
The returned dict has a slight structural difference (`machine_id` instead of the entire `machine`), but the only caller of this function is `wait_memory_free`, which only looks at `file_size` anyway.
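
For reviewers, a minimal sketch of the idea, not the actual code in this PR: the status filter and the field selection are expressed as a MongoDB filter and projection, so the server returns only matching documents and only the fields needed. The `status` field name, its values, and the helper names `build_status_query` and `total_file_size` are assumptions made for illustration; `machine_id` and `file_size` come from the description above.

```python
from typing import Any, Dict, Iterable, List, Tuple


def build_status_query(statuses: Iterable[str]) -> Tuple[Dict[str, Any], Dict[str, int]]:
    """Build a (filter, projection) pair for a MongoDB find() call.

    Assumption: each document stores its parsing state in a `status` field.
    """
    # Filtering happens server-side: only documents whose status matches are returned.
    query_filter = {"status": {"$in": list(statuses)}}
    # The projection keeps documents small: machine_id instead of the whole machine.
    projection = {"_id": 0, "machine_id": 1, "file_size": 1}
    return query_filter, projection


def total_file_size(documents: List[Dict[str, Any]]) -> int:
    """Sum file_size over the returned documents, the only field the caller reads."""
    return sum(doc["file_size"] for doc in documents)
```

With pymongo, the pair would be passed as `collection.find(query_filter, projection)`, where `collection` is whichever collection holds the parsed-file records.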