juglab / cryoCARE_pip

PIP package of cryoCARE
BSD 3-Clause "New" or "Revised" License
25 stars 14 forks source link

Change the output name to something independent from odd/even? #59

Open rdrighetto opened 5 months ago

rdrighetto commented 5 months ago

I am not sure how best to address this, so I'm raising this issue for brainstorming:

When predicting, the output is currently written with the same file name as the even tomogram supplied: https://github.com/juglab/cryoCARE_pip/blob/f34597d5c0403beadc6146e8b5e78f344d94744d/cryocare/scripts/cryoCARE_predict.py#L153

This is confusing to many users, as it seems like the output depends only on the "even" tomogram, when in fact it's the average of the denoised odd and even tomograms.

Potential issues I see:

  1. It's hard to know what a "good" output name looks like, since in principle the input odd and even tomograms can be called anything (and I think this is how it should be)
  2. It's even harder in the case where multiple tomograms are being predicted at once, and we need to identify them uniquely in a way that is easily traceable to the original inputs

One way I see to address this "correctly" is to require explicit output names for every tomogram being predicted, like we already do with the inputs. This is a major'ish API change IMO.

Any thoughts?

TomCrey commented 5 months ago

In my opinion, the simplest way would be to append a specific output name: instead of just using the basic name of the even tomogram, you could add a recognizable suffix or prefix to indicate that the output is the result of denoising the odd and even tomograms. For example, replace line 153 as follows: out_filename = os.path.join(config['output'], "denoised_" + os.path.basename(even))

This would make it clearer that the output is not just based on the even tomogram.

rdrighetto commented 5 months ago

Hi @TomCrey!

Thanks for the suggestion! The problem I see with doing it this way is that it still contains the name of the even tomogram in the file name, so I'm not sure it would solve the confusion.

TomCrey commented 5 months ago

Hi @rdrighetto! I don't think it's the perfect method, but it would reduce the confusion. Alternatively, we maybe could allow users to specify custom output file names for each predicted tomogram. This would involve modifying the predict_config.json script to accept output file names as output name (but I wouldn't know how to configure this in the json). While this may represent a significant API change, it could offer the greatest flexibility and clarity for users.