DiamondLightSource / httomo

High-throughput tomography pipeline
https://diamondlightsource.github.io/httomo/

Paganinsweep #521

Closed dkazanc closed 6 days ago

dkazanc commented 2 weeks ago

Fixes #424

dkazanc commented 2 weeks ago

While working on this, I thought that maybe we could avoid the limits on the sweep preview imposed by GPU memory by using the CPU Paganin method instead?

In this case, we can be less conservative about the size of the preview we take, even if the whole projection image has to be used for a very wide kernel. Also, doing things like this doesn't feel very elegant, and the main question is how vertical_slices_preview_max can be estimated across different GPU devices. What memory size should we link that value to? It just feels a bit fragile at the moment. We could, potentially, swap the backend from httomolibgpu to httomolib for that method? It could, however, fail on some other GPU function if we decide to take the previewed block size too large.

Worth having a chat about it, @yousefmoazzam?

yousefmoazzam commented 2 weeks ago

Yeah sure, I'm happy to discuss things at some point.

For now, I'd like to try to enumerate the different points raised, for the sake of clarity. I think I counted three (but do correct me if there are more or fewer):

  1. the idea of using the CPU Paganin filter in sweep pipelines that involve Paganin, to relax the threshold on the maximum number of vertical slices
  2. if we keep the GPU Paganin filter in sweep pipelines: the question of how the calculation of the maximum vertical slices can be made to work across multiple GPUs (i.e., how to make the maximum vertical slices threshold generic across GPU models)
  3. regardless of whether the Paganin filter is CPU or GPU: the concern that the modifications to the preview to accommodate the Paganin filter kernel size could cause GPU OOM errors for other GPU methods in the sweep pipeline

dkazanc commented 2 weeks ago

thanks @yousefmoazzam, you've summarised the issues well. So I'd say we should go either full CPU or full GPU, with no hybrid modes, as a hybrid will essentially run into the same GPU OOM issue at some point.

Of course, going full CPU will make it slightly inconvenient for users, as they would need to build a TomoPy analogue. A possible solution (if we decide to go that way) is to create httomolib CPU duplicates of the GPU methods in the httomolibgpu library. Then, when the user runs a sweep pipeline with Paganin involved, we would modify the module paths from httomolibgpu to httomolib, leaving everything else intact (see the sketch below). Some httomolib methods could import and reuse TomoPy methods if needed.
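
A minimal sketch of that module-path swap, assuming method module paths are plain strings (the function name is illustrative, not existing httomo API):

```python
def to_cpu_backend(module_path: str) -> str:
    """Swap the GPU backend for its CPU counterpart, leaving the rest intact."""
    # e.g. "httomolibgpu.prep.phase" -> "httomolib.prep.phase"
    return module_path.replace("httomolibgpu", "httomolib", 1)
```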

I know it doesn't sound ideal, but it is one possible way to avoid the potentially frequent situations where large blocks won't fit on the GPU, resulting in constant warning messages to users.

If we still decide to proceed with the GPU implementation, I think we need to deal with the situation where the largest block defines the needed size for the sweep run. I suggest that only the accepted blocks (below the upper limit) are taken, and the list of parameters is modified accordingly. For instance, alpha = [0.1, 0.2, 0.3] is given, but the block sizes are acceptable only for the values [0.1, 0.2]. I'd discard 0.3 in that instance and proceed with the run for 2 parameters, rather than abort the whole run completely (see the sketch below). So, as you can see, basically more hacks to make the GPU stuff work...
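
Roughly, that filtering could look like this (a hedged sketch; `estimate_block_size` and `max_block_size` are hypothetical stand-ins, not httomo API):

```python
from typing import Callable, List

def filter_sweep_values(
    values: List[float],
    estimate_block_size: Callable[[float], int],
    max_block_size: int,
) -> List[float]:
    """Keep only the sweep values whose estimated block fits the memory limit."""
    accepted = [v for v in values if estimate_block_size(v) <= max_block_size]
    discarded = [v for v in values if v not in accepted]
    if discarded:
        print(f"Discarding sweep values {discarded}: block size exceeds the limit")
    return accepted

# e.g. alpha = [0.1, 0.2, 0.3] may reduce to [0.1, 0.2]
```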

dkazanc commented 2 weeks ago

OK, so as a conclusion of our discussion, @yousefmoazzam, I'll do the following.

yousefmoazzam commented 2 weeks ago

Sounds good, worth a shot to see how it goes :+1:

It's worth pointing out, since we didn't note it in the discussion earlier: with this approach we've addressed points 1 and 2 (in the summary I attempted above), but point 3 is still unaddressed. Dunno if you'd like more discussion before going ahead and trying this approach, or if we should just deal with point 3 later; it's up to you.

dkazanc commented 2 weeks ago

Actually, point 3 is addressed to some degree by relying on a memory-hungry method (Paganin) to decide the maximum size for a block. I'm hoping that the methods before and after Paganin in the pipeline require less memory than Paganin itself. We will see if this is the case when we implement our hack around memory estimation.

dkazanc commented 2 weeks ago

I think I need some help fixing this test please @yousefmoazzam. The reason this test fails is that the source._data object here is not accessible (mocked?) in the test. I'm not sure if there is any other way to obtain the unpreviewed shape of the raw data, which is what source._data.shape[1] does, or maybe the test itself needs to be modified?

I've tested this approach on 20, 40, and 80 GB datasets (different numbers of angles) and also on different GPU cards, mainly P100 and V100. I haven't got an OOM error so far. This factor essentially defines the number of slices so that the Paganin method doesn't break (or, I hope, any other method in the pipeline). I think the second most memory-hungry one so far is FBP, and it doesn't break. But with the Fourier recon we might want to reconsider and increase the factor. Thanks.
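
For reference, the kind of estimate being described might look like the sketch below; it uses cupy's mem_info to query free GPU memory, and the factor and per-slice cost are placeholders, not the values used in this PR:

```python
import cupy as cp

def max_vertical_slices(bytes_per_slice: int, factor: float = 0.1) -> int:
    """Estimate how many vertical slices of the preview fit on the current GPU.

    `factor` is the fraction of free memory the previewed block may occupy.
    """
    free_bytes, _total_bytes = cp.cuda.Device().mem_info  # (free, total) in bytes
    return max(1, int(free_bytes * factor) // bytes_per_slice)
```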

dkazanc commented 2 weeks ago

And a follow-up, a slightly different approach to the one I suggested earlier: I do not kill the run even if the kernel width is larger than the block that fits in memory. I just take the largest block in that case and proceed with the run. Users still get a result that is smoothed and will probably discard it themselves, but we're safer from questions about why the runs are terminated so frequently.

yousefmoazzam commented 2 weeks ago

I saw that the PR is now marked as ready for review, so I'm happy to go over it at some point soon :slightly_smiling_face:

For now, I'll try to answer this:

I think I need some help fixing this test please @yousefmoazzam. The reason this test fails is that the source._data object here is not accessible (mocked?) in the test. I'm not sure if there is any other way to obtain the unpreviewed shape of the raw data, which is what source._data.shape[1] does, or maybe the test itself needs to be modified?

Yep, the issue seems to be that the test provides a mock loader/data source, and the mock doesn't have the private ._data attribute that the sweep runner now relies on.

I would say that the best way to resolve this is to not use the private attribute, and to find another way to get access to the raw data's global shape. This would involve more changes and some thinking, of course, but I think it's the safer way to do things; I'm not sure there are many places where reliance on the private members of an object is advocated.

If accessing the private attribute absolutely must be the way to do this, then I agree that the test will need to be changed to accommodate the assumption the code is making about the existence of the private attribute ._data. How it should be changed, I don't yet know; that will also require some thinking.

Before, the loader wrapper + loader mocks could be used because the sweep runner was not making any assumption about private members of a DataSetSource, so we didn't need to provide a "real" loader. This in turn made writing the test simpler, since we could use a mock loader instead of a real one. But now we need a real loader (because only a real loader has the private attribute ._data), and the test needs to be rethought a bit.

yousefmoazzam commented 2 weeks ago

An example of a potential way to solve this without doing the private attribute access would be to modify DataSetSource, and its implementors.

If it makes sense for any data source to have the information about the raw data's global shape (which should be checked for reasonableness), then one could add a raw_shape property to the DataSetSource protocol and implement it in the implementors, as in the sketch below.
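
A minimal sketch of that protocol addition (the real DataSetSource has more members, elided here):

```python
from typing import Protocol, Tuple

class DataSetSource(Protocol):
    # ...existing members elided...

    @property
    def raw_shape(self) -> Tuple[int, int, int]:
        """Global (unpreviewed) shape of the raw data."""
        ...

# An implementor then returns the shape it already holds internally
# (e.g. self._data.shape), so callers never touch the private field.
```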

This change would be minimal, and would avoid doing any private member access.

This is just an example to illustrate that there are ways to go about this without the private member access, and to provide some sort of substance to my advice to "find another way to get access to the raw data global shape".

dkazanc commented 1 week ago

This is just an example to illustrate that there are ways to go about this without the private member access, and to provide some sort of substance to my advice to "find another way to get access to the raw data global shape".

Thanks, I've done some changes as you suggested. test_execute_modifies_block still fails, but now because PreviewConfig is not available in the loader; this is something we need before it gets reassigned. Should make_test_loader take the preview into account somehow?

yousefmoazzam commented 1 week ago

Thanks, I've done some changes as you suggested. test_execute_modifies_block still fails, but now because PreviewConfig is not available in the loader; this is something we need before it gets reassigned. Should make_test_loader take the preview into account somehow?

This is a similar issue to the previous one, but now with StandardLoaderWrapper (and not involving private member access, but instead implementation-specific assumptions): the sweep runner assumes the loader wrapper exposes the preview config, but the mock wrapper hasn't been configured with it.

Also similar to before, a potential solution would be to modify the protocol in question (LoaderInterface) and its implementor (StandardLoaderWrapper).

If it would make sense to have any "loader wrapper" provide the associated preview config (which does sound reasonable to me), then it may make sense to add a preview property to the LoaderInterface protocol and implement it in StandardLoaderWrapper, as in the sketch below.
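
A sketch of the shape this could take; the constructor and delegation details are assumptions, not the actual httomo implementation:

```python
from typing import Protocol

class LoaderInterface(Protocol):
    # ...existing members elided...

    @property
    def preview(self) -> "PreviewConfig":
        """The preview config associated with the wrapped loader."""
        ...

class StandardLoaderWrapper:
    def __init__(self, loader) -> None:
        self._loader = loader

    @property
    def preview(self) -> "PreviewConfig":
        # delegate to the wrapped loader, which already holds the preview
        return self._loader.preview
```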

Side note: given that the loader wrapper would provide the preview if the above is done, and that the loader itself would also be able to provide this info, I'm becoming more wary of whether the existence of the loader wrapper makes much sense (and whether the wrapper should instead just go away, with loaders being direct implementors of LoaderInterface). Tagging #504, as this is a relevant piece of info for that.

dkazanc commented 1 week ago

Thanks Yousef, this gets a bit more in-depth, but I will give it a go. Also, the tests will be failing even if the preview stuff is sorted, because of this: source.raw_shape is not defined for DataSetBlock. Should it be, though?

yousefmoazzam commented 1 week ago

Thanks Yousef, this gets a bit more in-depth, but I will give it a go. Also, the tests will be failing even if the preview stuff is sorted, because of this: source.raw_shape is not defined for DataSetBlock. Should it be, though?

Sorry, I'm not sure I understand: where is the need for DataSetBlock to have the .raw_shape property? The source variable in that function is only ever a value of the type returned by self._pipeline.loader.make_data_source(), and the type that function returns is DataSetSource. Could you point me to where DataSetBlock comes into the picture here?

dkazanc commented 1 week ago

Yes, it's the tests for the sweep runs; they are all built around using DataSetBlock rather than the loader, where source.raw_shape would otherwise be available.

yousefmoazzam commented 1 week ago

Ah ok, thanks for the link, I see that the tests create a DataSetBlock for the block splitter to return (which gets the block from the mock loader), so that's one piece of the puzzle in understanding a bit better on my end.

I'm still not sure where there's a need for DataSetBlock to have .raw_shape though? The .raw_shape attribute is for data sources, and a DataSetBlock isn't a data source; it's something that is produced by a data source.

The function/purpose of the DataSetBlock created in the tests is that it's a block produced by the block splitter (which, as I mentioned before, gets the block from the mock loader); the mock loader will already be configured to have the .raw_shape attribute from the changes made earlier in the PR, so source.raw_shape will now work in the sweep runner. And when the splitter is asked for a block here: https://github.com/DiamondLightSource/httomo/blob/fd8f6a6ed3ba96a483d4768d681faa18993ea807/httomo/sweep_runner/param_sweep_runner.py#L188

then the DataSetBlock created in the test is returned by the block splitter. After that, I can't yet see where the DataSetBlock has any need for a .raw_shape attribute.

So, there's something I'm still not understanding I think: could you point me to where exactly you think the DataSetBlock would need .raw_shape?

dkazanc commented 1 week ago

OK, so a couple of things here:

  1. I've done the preview introduction into the loader as suggested. In the new test tests_preview_modifier_paganin I'm passing the preview config into make_test_loader and then create_autospec. The resulting preview in the loader is still a mocked object, but I was hoping it would be what I actually passed in. Could it be that create_autospec doesn't support preview classes being passed in, as I see that the other variables are just normal types?
  2. Secondly, on raw_shape again: in the test above, source.raw_shape is also a mocked object, but I need the actual shape of the raw data in order to get something meaningful from the updated preview in the test. Maybe I should mock it somehow, since apparently it shouldn't be in DataSetBlock, as you pointed out earlier?

yousefmoazzam commented 1 week ago

OK, so a couple of things here:

  1. I've done the preview introduction into the loader as suggested. In the new test tests_preview_modifier_paganin I'm passing the preview config into make_test_loader and then create_autospec. The resulting preview in the loader is still a mocked object, but I was hoping it would be what I actually passed in. Could it be that create_autospec doesn't support preview classes being passed in, as I see that the other variables are just normal types?

The preview needs to be passed to the autospeccing of the LoaderInterface, currently it's not: https://github.com/DiamondLightSource/httomo/blob/fc38600be7b31254f730176d41ed66293beb5331/tests/testing_utils.py#L51-L64

The self._pipeline.loader.preview is essentially doing StandardLoaderWrapper.preview / LoaderInterface.preview, and because create_autospec() for LoaderInterface hasn't been given the preview, the mock loader wrapper doesn't know about the preview, so it won't have the preview value that was passed into make_test_loader().
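
For instance, something along these lines (a sketch; the real make_test_loader in testing_utils.py has more parameters, and the class below is a stand-in for httomo's LoaderInterface):

```python
from unittest.mock import create_autospec

class LoaderInterface:  # stand-in for httomo's protocol (members elided)
    preview = None

def make_test_loader(mocker, preview=None):
    # passing `preview` as a kwarg configures the attribute on the mock,
    # so `loader.preview` returns the value the test supplied
    return create_autospec(LoaderInterface, instance=True, preview=preview)
```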

  2. Secondly, on raw_shape again: in the test above, source.raw_shape is also a mocked object, but I need the actual shape of the raw data in order to get something meaningful from the updated preview in the test. Maybe I should mock it somehow, since apparently it shouldn't be in DataSetBlock, as you pointed out earlier?

Yeah, that makes sense; by default the mock loader wrapper won't have an implementation of the raw_shape getter. Patching .raw_shape to return the global shape defined in the test would be a reasonable approach here, I think.

In the past I've seen issues with trying to use mocker.patch.object() (which is how most things are patched in httomo tests - see make_mock_repo() in testing_utils.py for an example) for patching getter properties. Using PropertyMock seems to be the preferred way to mock getter properties. But of course, if you find a way to use mocker.patch.object() to patch the .raw_shape getter, feel free to do it that way.
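
For example, a sketch of the PropertyMock approach (the class stub and shape value are hypothetical stand-ins for what the test defines):

```python
from unittest.mock import PropertyMock, create_autospec

class DataSetSource:  # stand-in for httomo's data source (members elided)
    @property
    def raw_shape(self):
        ...

mock_source = create_autospec(DataSetSource, instance=True)
global_shape = (180, 128, 160)  # hypothetical global shape used by the test
# attach the PropertyMock to the mock's type so attribute access goes through it
type(mock_source).raw_shape = PropertyMock(return_value=global_shape)

assert mock_source.raw_shape == global_shape
```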

dkazanc commented 1 week ago

@yousefmoazzam sorry, even with the changes you suggested I still cannot get the correct raw_shape from the loader. Can you have a look please?

dkazanc commented 1 week ago

OK, currently all tests pass for me except two, test_insert_image_save_after_sweep and test_insert_image_save_after_sweep2, but that is expected, as PR #523 should fix them. The new test tests_preview_modifier_paganin covers the added functionality for calculating the new preview size for sweep runs based on the kernel size of the smoothing filter. Thanks for the help @yousefmoazzam.

dkazanc commented 6 days ago

One last thing that came to my mind: this condition catches both the TomoPy and Savu implementations but then proceeds to work with TomoPy parameters, so most likely the Savu implementation will fail here. As the Savu implementation pads the data inside the method itself, I'm thinking of just letting it work as a normal method by taking 5 (or whatever the default is) slices. So I guess I'll change this condition to look for paganin_filter_tomopy specifically.
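
For illustration, the narrowed check might look like the sketch below; `method.method_name` and the helper are assumptions about how the sweep runner identifies and handles the method, and paganin_filter_savu is my assumption for the Savu-style implementation's name:

```python
# only the TomoPy-style filter needs the widened preview;
# the Savu-style one (paganin_filter_savu) pads internally,
# so it runs as a normal method with the default number of slices
if method.method_name == "paganin_filter_tomopy":
    preview = modify_preview_for_paganin(preview, method)  # hypothetical helper
```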