BioMemPhys-FAU / domhmm

Open source package to identify nano- and microdomains in Molecular Dynamics simulations of bio-membranes.
https://www.biomemphys.nat.fau.de
GNU General Public License v2.0
1 stars 0 forks source link

Something to take care of! #14

Closed m-a-r-i-u-s closed 5 months ago

m-a-r-i-u-s commented 6 months ago

A summary of some points that came into my mind as I reviewed the code

yusuferentunc commented 5 months ago
  • [x] Vectorize the functions order_parameter and calc_order_parameter. The reason I implemented it like that was to (really) ensure that the angle between the right beads are calculated. However, the usage of the two for-loop would kill performance. I think we can trust MDAnalysis that is doesn't randomly re-orders atoms in a selection group, but we cannot trust users. Therefore, it would be nice to have a kind of "safety net" in base.py that checks the distance between two subsequent beads/atoms in a selection. If the distance is larger than a pre-defined cutoff (e.g. the physical distance between two C-atoms, which is ~0.15nm) the program should raise at least a warning.

Vectorisation is done in #10 Safety net will be implement in further works

yusuferentunc commented 5 months ago
  • [x] Only the average of the S_CC values over a single chain should be stored, if we store single S_CC values the results array will be pretty huge. However, we should have a unit-test that checks the correctness of this function.

I believe mean is done at https://github.com/m-a-r-i-u-s/domhmm/blob/0e7c0dca0f1dcca9e7ffef3be6e33c4fb63b6316/domhmm/analysis/domhmm.py#L124

m-a-r-i-u-s commented 5 months ago

Yep! Fine.

yusuferentunc commented 5 months ago
  • [x] Size of the weight matrices scale quadratically with the number of lipids. For large membranes (lateral length > 50nm) jupyter notebooks keep crashing due to "too big" numpy arrays. I thought about the usage of sparse matrices (e.g., from scipy module) since the weight matrices have a high sparsity (a lot of lipids are not connected with each other). I tested already a little bit the coo_array format from scipy and that worked quite well because it seems to preserve a number of basic numpy.array functionalities.

Sparce matrix implementation for weight matrices done at branch weight_matrix_optimize. Scipy Sparse's csr_array is selected for compressing method since it works just like a numpy array comparing to coo_array.

Pull request will be open whenever GitHub actions' monthly quota renewed.