Open Pellarin opened 10 years ago
Agreed! First get the best 500 frames and store in one file, then analysis can be done later.
Good! Also there is a memory leak somewhere when the clusters are saved.
I added a function to collect the best models (PMI.io.input.save_best_models()), but it doesn't read the RMFs in parallel. We can probably wait until the next version to use it in the macro.
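For reference, reading the per-file data in parallel and then picking the best models could look roughly like the sketch below. It is not how save_best_models() works: `read_frames`, `best_models_parallel`, and the JSON score files are hypothetical stand-ins for the real RMF reading, and a thread pool only helps if the reading is I/O-bound (CPU-bound parsing would need a process pool instead).

```python
from concurrent.futures import ThreadPoolExecutor
import heapq
import json


def read_frames(path):
    """Hypothetical reader: one JSON list of scores per file (stand-in for an RMF)."""
    with open(path) as fh:
        scores = json.load(fh)
    # Record where each frame came from so it can be extracted later.
    return [(score, str(path), index) for index, score in enumerate(scores)]


def best_models_parallel(paths, n_best=500, max_workers=4):
    """Read all files concurrently and return the n_best lowest-scoring frames."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        per_file = list(pool.map(read_frames, paths))
    all_frames = (record for records in per_file for record in records)
    # Tuples sort by score first, then by path and frame index.
    return heapq.nsmallest(n_best, all_frames)
```

Each result is a (score, path, frame_index) tuple, so the selected frames can be pulled from their source files afterwards.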
The analysis macro is really a mess, I admit. Most of the problems come from handling huge files, the need to keep memory usage low, and the parallel calculation.
One issue is that the information in the RMFs is extracted twice, before and after the clustering, which makes it really slow.
What about the following scheme:
1) Extract the frames into RMF files, stored as single-frame RMFs in a directory that can be reused in the future for other clustering runs that use the same matrix.
2) When reading back the coordinates, just open the saved RMFs instead of the original huge RMF files.
That will make the rmf reading considerably faster.
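The two-step scheme could be sketched roughly as follows. This is only an illustration under stated assumptions: plain JSON files stand in for single-frame RMFs, and the names `cache_best_frames` and `load_cached_frames`, the file naming, and the frame dictionary layout are all hypothetical, not part of PMI.

```python
import heapq
import json
from pathlib import Path


def cache_best_frames(frames, out_dir, n_best=500):
    """Step 1: write the n_best lowest-scoring frames, one file per frame.

    `frames` is any iterable of dicts with "score" and "coords" keys
    (a hypothetical stand-in for frames read from the huge RMF files).
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # With a generator as input, only about n_best frames are held at once.
    best = heapq.nsmallest(n_best, frames, key=lambda f: f["score"])
    paths = []
    for rank, frame in enumerate(best):
        path = out_dir / f"frame_{rank:05d}.json"
        path.write_text(json.dumps(frame))
        paths.append(path)
    return paths


def load_cached_frames(cache_dir):
    """Step 2: read the small cached files back, in rank order."""
    files = sorted(Path(cache_dir).glob("frame_*.json"))
    return [json.loads(p.read_text()) for p in files]
```

Later clustering runs would then call only `load_cached_frames` on the same directory and never touch the original files.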