Feature/saving by directory management

faildeny commented 2 years ago

This PR addresses the problem of saving large numbers of samples. The approach lets each model package save its outputs in its own defined way and then moves all generated batches to the output directory.

The one thing I cannot do nicely is add the sample index to the final filenames (for cases where a model outputs several files for one sample, such as an image and a mask).
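For illustration, the "save in the package's own way, then move" step described above could be sketched roughly as follows. This is not the code from this PR: the function name `move_batch_to_output`, the `batch_index` parameter, and the `batchNNNN_` filename prefix are all hypothetical, chosen only to show how files from successive batches could be moved without name collisions.

```python
import shutil
from pathlib import Path
from typing import List


def move_batch_to_output(batch_dir: Path, output_dir: Path, batch_index: int) -> List[Path]:
    """Move every file of one generated batch into output_dir.

    ASSUMPTION: the "batchNNNN_" prefix is a hypothetical naming scheme,
    used here only so that files from different batches do not overwrite
    each other. It is not a medigan convention.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in sorted(batch_dir.iterdir()):
        if not src.is_file():
            continue  # skip sub-directories; only plain files are moved
        dst = output_dir / f"batch{batch_index:04d}_{src.name}"
        shutil.move(str(src), str(dst))
        moved.append(dst)
    return moved
```

Note that this prefixes a batch index, not a per-sample index, so it does not solve the open point about files that belong to the same sample.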

faildeny commented 2 years ago

According to #37, saving samples in this format (every output type in the same directory) makes it hard to compute FID. Do you think we could split them into separate directories?

RichardObi commented 2 years ago

> According to #37, saving samples in this format (every output type in the same directory) makes it hard to compute FID. Do you think we could split them into separate directories?

Thanks for making me aware of issue #37 and the case for FID. As you also pointed out, the second option in #37 would make FID (and similar) calculations more straightforward, but, on the other hand, it would require an update of medigan's guidelines and models (e.g., 00004, polyp models). To avoid the need for guideline and model updates while still giving users maximum flexibility in the file storage pattern, I suggest a hybrid approach: an additional parameter of the generate function, e.g., create_output_folder_by_type: bool. If true, medigan separates the outputs of different types (e.g., masks and imgs) into separate folders, as suggested in option 2 in #37. Otherwise, all generated files stay in the same folder, as in option 1 in #37. The function in medigan's model_executor that does this could detect different types of outputs by checking for corresponding j as described in the comment above.
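A minimal sketch of the hybrid behaviour, to make the suggestion concrete. Only the flag name `create_output_folder_by_type` comes from the proposal above. The function names `organize_outputs` and `guess_output_type`, the folder names `masks` and `images`, and the rule "a filename containing `mask` is a mask" are assumptions made for illustration, not medigan's actual detection logic.

```python
import shutil
from pathlib import Path
from typing import List


def guess_output_type(path: Path) -> str:
    # ASSUMPTION: hypothetical naming rule, not medigan's real detection.
    # Files whose name contains "mask" are treated as masks, all others as images.
    return "masks" if "mask" in path.stem.lower() else "images"


def organize_outputs(output_dir: Path, create_output_folder_by_type: bool = False) -> List[Path]:
    """Return the final paths of all generated files in output_dir.

    If create_output_folder_by_type is False, files stay where they are
    (option 1 in #37). If True, each file is moved into a sub-folder named
    after its guessed type (option 2 in #37).
    """
    files = sorted(p for p in output_dir.iterdir() if p.is_file())
    if not create_output_folder_by_type:
        return files
    moved = []
    for src in files:
        target_dir = output_dir / guess_output_type(src)
        target_dir.mkdir(exist_ok=True)
        dst = target_dir / src.name
        shutil.move(str(src), str(dst))
        moved.append(dst)
    return moved
```

With the flag set, an FID computation could then be pointed at the `images` sub-folder alone, while the default keeps the current flat layout unchanged for existing users.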

faildeny commented 2 years ago

Bugs fixed and all tests pass.

RichardObi / medigan

Feature/saving by directory management #36