Marcraven commented 3 months ago

This pull request is a generealized vectorization of most procedures in xrd_simulator.

The main idea behind is to increase computation speed an facilitate code parallelization whenever possible. The motivation behind was my intention to use this code to simulate powder samples with several million tetrahedra.

The main files affected are:

mesh.py

Once the mesh is created by meshio/pygalmesh, subsequent computations related to bounding spheres, centroids etc. have all been vectorized.

polycrystal.py

The _diffract method has been extensively modified to compute all diffraction points simultaneously for each phase present in the sample. There is still work to be done on the side of the tetrahedron-beam intersection, in which a faster bounding-box approximation has been implemented. The convex-hull calculation for the ScatteringUnit is still a bottleneck, but so far I have no idea how to improve it.

Another minor improvement has been adding zd and yd to the scattering unit to prevent repeating the calculation later in the rendering.

utils.py

Geometric calculations for circumspheres have been added here which help accelerate the mesh.py script a lot. A printvars utility has been added to help debugging memory issues during code execution.

Current bottlenecks

The tetrahedra-beam intersection
The convex-hull calculation
The intensity calculation during rendering

Vectorization of these last steps is not trivial and has to be thought through. The last one is probably doable although I could not find the time.

Suggestions

Include beam absorption physics
Allow for a range of wavelengths to be simulated rather than a single one
Do not impose >0 rotation for rigid body motions, perhaps one only wants to translate.

AxelHenningsson commented 2 months ago

Did these changes introduce new dependecies: i see "ModuleNotFoundError: No module named 'pandas'" in the logs.

Marcraven commented 2 months ago

Did these changes introduce new dependecies: i see "ModuleNotFoundError: No module named 'pandas'" in the logs.

Hi Axel, yes pandas is now used briefly in polycrystal to make the information more readable in tables. I could be done using purely numpy, but it is quite hard to debug a matrix with 17 columns being each column a different concept...

Marcraven commented 2 months ago

On the other hand there is a conflict in the test for arm64 and python 3.10.0 not being available for that architecture. Could the test be changed to Python 3.10.11 or some other build which is available for arm64 at https://raw.githubusercontent.com/actions/python-versions/main/versions-manifest.json?

AxelHenningsson commented 2 months ago

Hi again,

Yeah, 3.10.11 should work.... but there is a recent issue with macos-latest changing image from 13 to 14. I belive conda is more or less broken for now.... see this issue :laughing:

I have pruned the action set to just run all tests on a linux machine for now. Hopefully we can improve on this in the future....

That being said there seem to be tests failing here. Did you run a pytest on you local machine before opening this pull-request?

Cheers :slightly_smiling_face: Axel

AxelHenningsson commented 2 months ago

@Marcraven Could you add in the latest action changes into this request and we will see if the tests passes now.

Cheers Axel

Marcraven commented 1 month ago

In this pull request ScatteringUnit has two new arguments, zd and yd.

That is because in a normal pipeline, the intersection between scattering unit and detector is done twice, one to filter out the units that do not intersect and once to render.

This way, zd and yd can be passed from the first calculation to the second one to save computation time.

I am now re-writing test_detector to adapt it to this new configuration, but this is all open for discussion.

Marcraven commented 1 month ago

test_laue.py fails without using any new part from my code. The strain tensor generated in get_pseudorandom_crystal and passed to _epsilon_to_b is not a tensor, but a vector. When given a tensor, then _epsilon_to_b fails because F = np.linalg.cholesky(C).transpose(0,2,1) cannot transpose in 3D a matrix that only has 2D.

AxelHenningsson commented 1 month ago

On it. :)

AxelHenningsson commented 1 month ago

Hi @Marcraven,

I'm very happy to see you working on making the test pass. I'm looking forward to merging all of this.

Regarding your question, _epsilon_to_b is indeed supposed to take a vector of six values. This is why the call to strain_tensor = _strain_as_tensor(epsilon) at the start of the function is necessary—it converts the vector into a strain tensor.

I’m having trouble understanding the problem you’re encountering. The tests pass fine on my local Linux machine when I clone the current repo and install everything.

Could it be that you’re running a new version of utils._epsilon_to_b? If you have made recent changes, there might be discrepancies between our environments.

Let’s sync up to resolve this!

Cheers, Axel

AxelHenningsson commented 1 month ago

Your pull request confirms this suspicion with F = np.linalg.cholesky(C).transpose(0,2,1) being newly added code :)

Marcraven commented 1 month ago

I am sorry! this is what happens when you come back to your code after weeks...

I modified _epsilon_to_b to accept any number of crystal strains, i will make it work for 2D now...

@AxelHenningsson _epsilon_to_b would make more sense to receive a tensor instead of a vector. The reason for this is, in lab_strain_to_B_matrix you are turning crystal_strain into a vector as an argument of _epsilon_to_b, just to turn the vector again into a matrix inside the function.

Marcraven commented 1 month ago

In this pull request ScatteringUnit has two new arguments, zd and yd.

That is because in a normal pipeline, the intersection between scattering unit and detector is done twice, one to filter out the units that do not intersect and once to render.

This way, zd and yd can be passed from the first calculation to the second one to save computation time.

I am now re-writing test_detector to adapt it to this new configuration, but this is all open for discussion.

What do you think of this @AxelHenningsson ?

AxelHenningsson commented 1 month ago

Fantastic! No worries at all—if anything, I’m at fault for being slow to reply to the excellent work you’re doing!

Regarding the ScatteringUnit, that sounds great. Since the ScatteringUnit is primarily used internally within xrd_simulator, complexity is less of a concern compared to components that interface directly with the user. Good catch!

Cheers, Axel

Marcraven commented 1 month ago

_instantiate_eB in now iterates over phases instead of iterating over elements. This is much more eficient because the loop runs less than 10 times typically, instead of whatever number of tets we have. But in order to work like this B0 has to be computed in _instantiate_eB and passed down to lab_strain_to_B_matrix and then to _epsilon_to_B.

The result of all this is that now _epsilon_to_b takes (crystal_strain, B0) instead of (epsilon, unit_cell)

AxelHenningsson commented 1 month ago

Sounds great.

Are tests getting close to passing yet?

Cheers Axel

Marcraven commented 1 month ago

Hi @AxelHenningsson ,

test_beam, test_detector, test_laue, test_mesh, test_motion, are passing already. Most times the problem is related to dimensional mismatchs, annoying but not fundamental.

I will keep you posted.

AxelHenningsson commented 1 month ago

Sounds like good progress! 👌

Marcraven commented 1 month ago

This is ready to ship. The docstrings should still be verified to fit with current changes though.

The key issue limiting performance right now are convexhull operations. I dont know any vectorized way to intersect many tetrahedra with some polyhedra simultaneously, probably there is some GPU way of doing it.

AxelHenningsson commented 1 month ago

Fantastic! This looks good to me in general! 🙂

A corner case for the mesh spheres came up in my testing on your fork - it is related to loss of precision in the line segment and triangle sphere wrapper functions. The fix is just to expand the sphere radii very slightly to ensure that the triangle and line segments are actually containing the nodes they claim to contain. Without this I get bounding sphere radii that change as a function of pure rigid body motions which makes testing awkward. see commit d145326

I think we should add docstrings to the functions that are user exposed before we can merge this. It would be a shame to have a bunch of docs that are lying about the code. The starting point would be to hide the functions that are not supposed to be used by the user using the underscore notation _hidden_function. For the remaining functions that changed the docs should follow the same sphinx format as before, this will enable the docs to get autogenerated. It looks like you have already added docstrings in some places?

Cheers Axel

AxelHenningsson commented 1 month ago

@Marcraven : Please let me know if you will have the possibility to get the docs for this major pull request in shape (don't need to be perfect). It would be a shame to loose all the good progress. ❤️

Once this gets merged I think we deserve to roll up the version count and move from alpha into beta with a new releases on pypi and conda channels.

Cheers Axel

Marcraven commented 1 month ago

Hi Axel,

I am sorry, I want to do it this week. Hopefully tomorrow!

Un saludo

Marc Raventós Tato

On Tue, Jun 11, 2024 at 9:04 PM Axel Henningsson @.***> wrote:

@Marcraven https://github.com/Marcraven : Please let me know if you will have the possibility to get the docs for this major pull request in shape (don't need to be perfect). It would be a shame to loose all the good progress. ❤️

Once this gets merged I think we deserve to roll up the version count and move from beta into alpha with a new releases on pypi and conda channels.

Cheers Axel

— Reply to this email directly, view it on GitHub https://github.com/FABLE-3DXRD/xrd_simulator/pull/9#issuecomment-2161427419, or unsubscribe https://github.com/notifications/unsubscribe-auth/AF4OE7EIJAVHB76P5KXPGVDZG5C35AVCNFSM6AAAAABGCPAAVSVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRRGQZDONBRHE . You are receiving this because you were mentioned.Message ID: @.***>

Marcraven commented 1 month ago

The fix is just to expand the sphere radii very slightly to ensure that the triangle and line segments are actually containing the nodes they claim to contain.

I increased the radii by 0.1% in the cases of line and triangle segments. ¿Would that be enough?

AxelHenningsson commented 1 month ago

The fix is just to expand the sphere radii very slightly to ensure that the triangle and line segments are actually containing the nodes they claim to contain.

I increased the radii by 0.1% in the cases of line and triangle segments. ¿Would that be enough?

Sounds good!

FABLE-3DXRD / xrd_simulator

Vectorization of xrd_simulator #9

mesh.py

polycrystal.py

utils.py

Current bottlenecks

Suggestions