AdvancedPhotonSource / tike

Repository for ptychography software
http://tike.readthedocs.io
Other
29 stars 15 forks source link

NEW: Implement the multi-GPU-based bucket approach for laminography #153

Closed xiaodong-yu closed 3 years ago

xiaodong-yu commented 3 years ago

Purpose

Enabling single-node multi-GPU-based execution of the ray-based laminographic reconstruction.

Approach

Since both the object and projection are 3D data, we need to split both of them. The users can choose how many pieces of data they want to split the object to; then the number of pieces of projection can be determined based on the number of devices. Due to the object partitioning, a grouped all_reduce communication needs to be performed after each fwd operation.

Pre-Merge Checklists

Submitter

Reviewer

pep8speaks commented 3 years ago

Hello @xiaodong-yu! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! :beers:

Comment last updated at 2021-09-20 20:06:22 UTC
xiaodong-yu commented 3 years ago

@carterbox I update the communicators in accordance with PR#152 and document the new methods in pool.py.

xiaodong-yu commented 3 years ago

@carterbox I update the APIs and DOCs accordingly. Please have a review again. Thanks.

xiaodong-yu commented 3 years ago

@carterbox I update codes according to your review.

carterbox commented 3 years ago

@xiaodong-yu, if you copy all of the Pool/Communication related changes into a separate PR, we can merge the Pool/Communication related changes today. I will start reviewing the Laminography related changes now.

xiaodong-yu commented 3 years ago

The tests which check whether the Lamino operator is equivalent to the tomography operator and which test the reconstruction have been disabled. We need these tests in order to check that we are actually solving the Laminography problem. (the adjoint tests only show that the adjoint operation is adjoint!). I'm happy to discuss changes to these tests or logic behind these tests if needed.

I'm not sure how long it gonna take to enable it. I'm starting a long vacation from next week; if it is complicated enough that can't be finished today, then I might need to do it after I come back.

xiaodong-yu commented 3 years ago

@carterbox Are you talking about the unittest.skip in tests.test_lamino (line 212)? If so, can we do it in a separate PR? It is not related to parallelization.

carterbox commented 3 years ago

No. Some of the lamino tests are disabled with return statements?

xiaodong-yu commented 3 years ago

No. Some of the lamino tests are disabled with return statements?

I misunderstand your request. Now they both are enabled.

xiaodong-yu commented 3 years ago

Is there any reason to keep the old Fourier-based lamino solver?

Can you change the lamino reconstruction tests, so that it saves the result into the reconstruction folder? (like the ptychography recon tests, then it will be available as a build artifact).

The Lamino test results now can be saved.