jrkerns / pylinac

An image analysis library for medical physics
MIT License

RFC: proposed structure for field params #332

Closed alanphys closed 3 years ago

alanphys commented 3 years ago

Request For Comment

This is the proposed structure for an extensible field parameters module that does the same as the existing flatsym module but can easily be extended for all sorts of parameters, such as FFF. Do:

    from pylinac import FieldParams

    fs = FieldParams.from_demo_image()
    fs.analyze(protocol='varian')  # or fs.analyze(protocol='all')
    print(fs.results())

I've made changes in core/profile.py as documented in issue #331. Let me know what you think. If you feel it has a place in the pylinac framework I'll carry on working on it and implement the rest of the parameters, write appropriate tests and document it.

Regards Alan

coveralls commented 3 years ago

Coverage Status

Coverage decreased (-2.7%) to 80.887% when pulling 253874352d400bb8729280e1b5d2fd0179e9ff1f on alanphys:fieldparams into 863fce0382f740cba90760f554b6d210fc44efe3 on jrkerns:master.

coveralls commented 3 years ago

Coverage Status

Coverage decreased (-5.8%) to 77.064% when pulling 18b5645015e49adad39bc2f16aae4a302ea5d326 on alanphys:fieldparams into 0232234230b2450961cfe9208e585e371bfe46c9 on jrkerns:master.

jrkerns commented 3 years ago

Hey Alan, As usual, I'm pretty behind with this project, so I'm finally getting around to this. Sorry. Generally speaking, the goal of this PR seems to be the same as that of the original flatsym module, so splitting it into multiple classes seems like an odd design choice from the users' perspective. There's also a lot of repeated code in the new class. I'd be far more in favor of just updating the flatsym class to use the protocol structure suggested. I'll leave comments in the PR. If I find time soon I'll also push up some commits on the branch for styling and some other stuff.

alanphys commented 3 years ago

Hi James

Thanks for looking at this. My original intention was to simply extend the flatsym module, but I immediately ran into problems with its structure, in that some parameters such as field edge and size were hard-coded while others such as flatness and symmetry were protocol dependent. Having thought about it, I decided it would be easier and simpler to create a new module. This would make it easier to evaluate and would also allow the original flatsym module to be retained for backwards compatibility. I originally called the new module flatsym2, but it soon became obvious that I was moving past flatness and symmetry only, so I renamed it to fieldparams.

The flatsym module has done excellent service, but in this day and age of FFF machines I certainly need more, and I believe pylinac will gain wider acceptance by supporting FFF.

I've got the inflection points for FFF working and as soon as I get a chance to finish testing I'll commit those changes.

You are welcome to retain, merge or drop this module. For me it is not a wasted effort as I was using the opportunity to develop a new algorithm for BeamScheme. I'd just like to know what you want to do before I go to the trouble of preparing documentation.

Hope this clarifies some of the thinking behind my decisions. Regards Alan

alanphys commented 3 years ago

PS: I'm not quite sure what code is redundant, but I was trying to limit the amount of code that needed to be pushed to a lower level. You are welcome to mark any redundant code. I certainly agree there is a lot of scope for improvement in the structure and style of the module.

jrkerns commented 3 years ago

Thanks for the responses Alan. I think what you're trying to add and the goal of pylinac is the same; thus I'd like to keep this as one module. You have found the limitations of my algorithm =). Would you be kind enough to give me some of your low-res datasets? I didn't have any when this was built so such scenarios were not on my radar and obviously didn't get tested either. Believe it or not, the velocity of pylinac should pick up here in a bit. 😉

jrkerns commented 3 years ago

What may make better sense for both of us is for you to add the relevant methods to the profile module that allow for better calculation of FFF and array datasets. If you'd like to make another profile class that's fine. Underlying modules I'm much looser about updates/changes. Given I want to update the flatsym module to keep it all in sync that's my problem. I will attempt to modify your PR to make it all work. We can work on it together on this branch until we're both happy. I'll start by just making/adjusting the skeleton of the module (class names, protocols, method names and parameters). If that is in line with the needs of FFF/low-res then we/I can do the grunt work after that.

alanphys commented 3 years ago

I have uploaded testsmall.dcm and the MapCheck2 file it was derived from to your DropBox. I have PTW and IBA files available at https://github.com/alanphys/BeamScheme/tree/master/TestFiles but I haven't converted these to a format pylinac can import yet. I suppose at some stage we will have to write import routines, although I think QATrack+ might have already done this.

alanphys commented 3 years ago

I've achieved what I set out to do, that is, implementing FFF parameters. In fact I've ended up doing a lot more than I intended, but there is still an enormous amount of work in terms of validation, optimising the code and reducing redundancy, writing documentation, and adding the final bells and whistles. However, I will hold off until you decide if, when and how you want to integrate this module.

Regards Alan

jrkerns commented 3 years ago

Alan, thanks so much for your hard work! I'm definitely excited to have a better profile and F&S algorithm. This is definitely the direction this needs to head. Can you do 2 things to finish this off? 1) Remove the .idea/* files from the PR and 2) Make a PR against a new branch (not master). I want to dig into this with some other data I recently got as well as yours. I'm trying to follow the feature branch git flow recently (like, really recently =D).

jrkerns commented 3 years ago

So I've finally carved out some time for this. The report you linked was very helpful and I like the generalizations they use to characterize flat and FFF beams. I was not able to understand some of the equations as they didn't appear fully described (e.g. the variables of eqns 6 and 7 are missing, or are they the same as defined in eqn 3?). Is the Hill function as you've coded it the same as equation 3? They appear different at first glance. The fitted curve also seems not to fit very well for at least a regular flat beam; the slope does not seem to match the actual beam. [Attached image: hill]

alanphys commented 3 years ago

Hi James. That's great. I've been on some much needed leave for the last two weeks, so I am only looking at your comments now. Responses as follows:

1) I'll certainly remove the .idea file. I see you have already done this.

2) Which branch would you like me to issue the pull request against? I don't think this is ready to make it into v2.5.

3) The parameters for equations 6 and 7 in NCS 33 are the same as in equation 3. Equation 6 is equation 3 solved for the inflection point (maximum slope) and equation 7 is equation 3 solved at (a + d)/2.

4) My equation is a slightly more general version of the Hill function, which in turn belongs to the class of sigmoid functions. When I implemented it in BeamScheme I used the LMath library, which already had a Hill regression using this equation, and I carried on using this equation when I did the pylinac implementation. The parameter equivalence is roughly as follows (an illustrative fitting sketch follows after point 5):

| Pylinac | NCS 33 | Description |
| --- | --- | --- |
| a | d | sigmoid low level |
| b - a | a | sigmoid high level |
| c | b | approximate inflection point |
| d | -c | slope of the sigmoid |

I re-derived the inflection point position and tested against Bistromath which uses the NCS 33 equation and got similar values.

5) During my testing of the Hill modelling I observed some variation in the inflection point (~0.1 mm) and considerable variation in the slope at the inflection point depending on the extent of the penumbra selected. Theo van Soest notes in the help for his BistroMath program "that the slope is much more sensitive to the quality of the fit and the chosen fit range." The fit of the slope is better if the data range is restricted to the slope region, but the regression was failing on some data sets. Also, the determination of the 50% value became inaccurate.

Much work remains determining the optimum penumbra extent for modelling.
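For illustration only, here is a minimal, self-contained sketch of the kind of fit I am describing: a four-parameter Hill/sigmoid fitted to a synthetic penumbra with scipy, with the point of steepest gradient taken numerically from the fitted curve. The function form and names follow the common 4PL convention and are not the exact code in the PR.

    import numpy as np
    from scipy.optimize import curve_fit

    def hill_4pl(x, low, high, infl, slope):
        """Common 4PL/Hill sigmoid: low/high plateaus, approximate inflection, slope."""
        return low + (high - low) / (1 + (infl / x) ** slope)

    # Synthetic penumbra: position in mm vs. normalized signal, with a little noise.
    rng = np.random.default_rng(0)
    x = np.linspace(40.0, 60.0, 41)
    y = hill_4pl(x, 0.05, 1.0, 50.0, 25.0) + rng.normal(0, 0.005, x.size)

    # Fit the four parameters; starting values are taken from the data itself.
    popt, _ = curve_fit(hill_4pl, x, y, p0=[y.min(), y.max(), x.mean(), 10.0])

    # Point of steepest gradient (inflection) evaluated numerically on a fine grid.
    xf = np.linspace(x.min(), x.max(), 2001)
    grad = np.gradient(hill_4pl(xf, *popt), xf)
    inflection_mm = xf[np.argmax(np.abs(grad))]
    print(popt, inflection_mm)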

In conclusion, in addition to optimising the modelling I need to:

If there is anything else you feel should be included or can be done to improve the module please feel free to suggest.

Regards Alan

HeerNMeester commented 3 years ago

Hi James and Alan,

I've been following this thread for a while now, and I think it is time to participate now that NCS33 has been mentioned. I'm a physicist at the Netherlands Cancer Institute in Amsterdam focusing on dosimetry and (linac) QA, and I'm a member of the NCS33 subcommittee.

I'm happy to see that NCS33 is being picked up by the community. Theo van Soest and I wrote the basis of this part of the report. This prepublication was intended to be used in new software versions of QA equipment, so that when the full report is published, one can use the parameters as quickly as possible. Theo has incorporated the parameter calculation in his BistroMath software, which is benchmarked against my own StarReader software, which I wrote to analyse and automate PTW's StarCheck files. Unfortunately, due to IP rights, I cannot open-source that, but I can supply anyone who is interested with a build for research purposes. When we got access to FFF beams in 2010, we expanded some parameters to trend our FFF beams. This was the basis of the NCS33 parameters.

We have 14 years of experience with the Hill function, or, as we call it, a four-parameter non-linear regression model. When we started using the PTW StarCheck back in 2007, we noticed some strange variation in the field size calculation and we ended up fitting the penumbra with this function. Theo later applied this in his software as well. The basic reason for the field size variation is explained in appendix 3B of the NCS report. As the prepublication was written in a hurry (due to demand from the community), the full report will go deeper into the parameters and the tests we performed with them.

I'm not a Python programmer myself yet (I still use Delphi), so I won't participate in pylinac soon. But if you have any questions about NCS33 or other QA-trend-related subjects, feel free to ask.

Kind regards, Thijs Perik The Netherlands Cancer Institute, Amsterdam https://www.linkedin.com/in/thijsperik/

jrkerns commented 3 years ago

@alanphys Thanks for the info. I kept at it yesterday and eventually did figure out that the parameter names were the same throughout. I was never able to get a fit of an EPID profile that was anywhere near good, but I realized this is likely due to the over-response of the EPID to the lower energies outside the field, which causes the lower sigmoid to over-respond. I tried the Hill function, eqn 3 from NCS, and the generalized logistic function, but none of them fit the data well regardless of the bounds or starting position I provided. I haven't tried this with device data (e.g. Profiler). My guess is it will work much better since there is no over-response outside the field. The Hill and sigmoid functions all seem to assume the lower and upper parts of the sigmoid have the same curve shape, which isn't true for the EPID. I really thought the generalized logistic function would work, but it failed terribly. I even tried with log(n) of the pixel values, but nothing has worked so far.

Re: point 2, I made a new branch, fieldparams, that we can contribute to outside of master until it's ready. It's just your branch with my commit to remove the .idea files so far.

Re: point 5, my observations were the same. It seems pretty sensitive to the initial data provided.

Given pylinac's focus is the EPID thus far, I really need to have something stable for the EPID. I am adding device support, so I could potentially just use the model for devices. EPIDs already have high resolution to begin with, so I'm not clear on the advantages of a model for a high-res detector.

@HeerNMeester Thanks for making yourself available. Can you let us know when the full report is published? Thanks.

HeerNMeester commented 3 years ago

@jrkerns The 4PL function is expandable with a 5th parameter which accounts for asymmetry in the function; maybe that helps with the EPID calculations. See: https://www.mathworks.com/matlabcentral/fileexchange/38043-five-parameters-logistic-regression-there-and-back-again We found it too complex for low-resolution profiles. As you already mentioned, the low-energy response of the EPID causes problems. Also, the saturation of the panel at higher dose rates (the old Elekta panel) was a reason to focus only on ion chamber/array data.
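For reference, the five-parameter logistic there takes the usual form F(x) = D + (A - D) / (1 + (x/C)^B)^E; the extra exponent E is the asymmetry parameter, and E = 1 recovers the 4PL.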

I will let you know when the report is available, but that might take another year or more.

jrkerns commented 3 years ago

@HeerNMeester Thanks for the tip. The Hill, 4PL, and 5PL appear to be special cases of the generalized logistic function. I definitely wanted that to work, but I did not get satisfactory results. Some plots are attached (the first is a large x-domain; the second is a narrowed-down x-domain; the 5PL function crashed). If I can figure out some weighting mechanic this might help. For the immediate future I will investigate this w/r/t device arrays, which do not over-respond. Assuming that works, I will apply the model to device arrays and simply interpolate for EPID and high-density devices until I can figure out a good model. Thanks for the info! [Attached images: fits, narrow]

alanphys commented 3 years ago

@HeerNMeester many thanks for your valuable input. If you have more info to share on your experience using the Hill model that would be great.

@jrkerns I think perhaps you are wanting too much from the Hill model. The object of using the Hill function is to get a more robust estimator for the inflection point, or point of steepest gradient, not to model the plateau or tail region. As a secondary objective we want the fit to be good over the region between 0.4 x the inflection point value and 1.6 x the inflection point value (roughly 20%-80%) so we can calculate the penumbra width. Looking at your first graph above, I feel the four-parameter Hill function is doing this adequately.

The question is: how good is good enough? Do we need to include some indicator of reliability? In my initial testing I did fit an EPID image of a WFF field (below). [Attached image: Test4] This had an R-squared of 0.998. Although the fit is not good in the plateau and tail regions, I felt the penumbra region was okay, and 0.998 for a model is pretty good. I found that the variation in the position of the inflection point depending on which data were selected for the model was small.

The issue of the inflection point slope is separate. Although NCS 33 recommends the slope as an estimator of the penumbra, it doesn't specify its calculation. I just got carried away and added it. To me, the average slope between the 1.6 x Infl and 0.4 x Infl points would be a more reliable estimator.

The beauty of the fieldparams algorithm is that adding a new parameter to calculate the steepest gradient or average slope is simple and defining a new protocol using these parameters is even easier. There is no reason why we can't define a protocol as NCS33 and another as FFF(Fogliata) etc. and let the user use what they want.

At any rate there is still a lot of testing that needs to be done and I completely understand, James, that you want something that is reliable and accurate before putting it out there.

Regards Alan

jrkerns commented 3 years ago

Alan, thanks for the work. After some further digging around, the Hill/4PLR does seem to do adequately, at least for the purposes you stated. It would seem to potentially break down at extreme penumbra levels (10/90, 5/95), but that seems a rare scenario. Given that SNC does simple linear interpolation and physicists have been using that for a number of years, all this seems like splitting hairs by comparison. 😂

I definitely like the appeal of a model-based approach that beautifully sidesteps the problem of resolution. What I'm working on now is a modification of SingleProfile with constructor parameters to let the user decide the method of fitting (Hill, linear interp, cubic interp), normalization (beam center, device center, max), etc. As you stated, we can add more later as desired. For the flat/sym module, if the image is assumed to be an EPID image, I will default to something like cubic interpolation for the time being, with parameters to change it if desired. For a low-res device the default will be the Hill function (again, with parameters to change it). I'll add something like a FlatSym.from_device() class constructor that will allow the user to load a device file and assume those default fitting/norm parameters.
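Roughly, the usage I have in mind would look like the hypothetical sketch below; from_device() is not released API, the file names are placeholders, and the analyze keywords are only illustrative.

    # Hypothetical sketch: from_device() and the device defaults described above
    # are planned, not released API, and the keyword names may change.
    from pylinac import FlatSym

    fs = FlatSym("my_epid_field.dcm")   # EPID image: would default to cubic interpolation
    fs.analyze(flatness_method='varian', symmetry_method='varian')
    print(fs.results())

    # Planned device loader; would default to the Hill-function fit for low-res data:
    # fs = FlatSym.from_device("profiler_export.prm")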

I can see a user comparing their default software of the device against pylinac and thinking there's a bug because the values are different. For that reason, allowing the user to change the parameters to suit their needs (maybe accuracy or maybe agreement with known methods) seems to make sense.

Thoughts?

alanphys commented 3 years ago

Hi James. Sounds good, although for evaluating the NCS33 parameters I am using a sigmoid model for the inflection point, a second order polynomial for the top of the profile and linear interpolation for the dose points. Would you be able to cater for all three of these? Your use scenario is indeed the case. I am aware of a few people that have compared pylinac to existing software and found differences. Resolving these, I believe, will go a long way to establishing the credibility of pylinac. Regards Alan

kegge13 commented 3 years ago

@HeerNMeester many thanks for your valuable input. If you have more info to share on your experience using the Hill model that would be great.

Regards Alan

As the author of BistroMath and a member of the NCS-33 subcommittee, I offer my help to improve pylinac as well. However, I don't speak Python at the moment.

jrkerns commented 3 years ago

@alanphys So far, I've changed SingleProfile to have far more constructor parameters and removed a lot of the interpolate=True parameters from the methods, as even before this PR I was unhappy with trying to handle it all. The signature is looking like so:

    def __init__(self, values: np.ndarray, dpmm: float = None,
                 interpolation: Interpolation = Interpolation.LINEAR,
                 ground: bool = True,
                 interpolation_resolution_mm: float=0.1,
                 interpolation_factor: float=10,
                 normalization_method: Normalization = Normalization.BEAM_CENTER,
                 edge_detection_method: Edge = Edge.FWHM,
                 edge_smoothing_ratio: float=0.003):

Interpolation can be none, linear, or cubic; normalization can be none, max, beam center, or geometric center (i.e. middle pixel); and edge detection can be FWHM, inflection via derivative (i.e. no fitting, but it can still handle FFF beams), or inflection via Hill (your PR, basically). Between these combinations I think it will be able to handle several scenarios. The methods have been simplified (e.g. penumbra(lower, upper)) but obviously take into account the constructor arguments. Additionally, the methods will return dictionaries for the most part to cut down on the side and kind arguments and just include it all.
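A rough usage sketch of that constructor; the import path and enum member names here are assumptions based on the signature above, and the penumbra call is illustrative only.

    import numpy as np
    # Assumed import path and enum members; the released API may differ.
    from pylinac.core.profile import SingleProfile, Interpolation, Normalization, Edge

    # Synthetic profile: low tails, ramped penumbrae, flat plateau.
    values = np.concatenate([np.full(50, 0.03), np.linspace(0.03, 1.0, 20),
                             np.full(100, 1.0), np.linspace(1.0, 0.03, 20),
                             np.full(50, 0.03)])

    prof = SingleProfile(
        values,
        dpmm=1.0,
        interpolation=Interpolation.LINEAR,
        normalization_method=Normalization.BEAM_CENTER,
        edge_detection_method=Edge.INFLECTION_HILL,  # or Edge.FWHM / Edge.INFLECTION_DERIVATIVE
    )
    print(prof.penumbra(lower=20, upper=80))  # methods return dicts, per the description above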

As for the "top", I like the 2nd-order poly fit so far, but obviously it only needs to be called for FFF beams. Additionally, NCS33 says to analyze it across the middle 5 cm (is this always at iso? It seems depth-, FS-, and energy-dependent), but this already assumes a field of at least 5 cm (which I cannot assume; not sure why you'd analyze an FFF field <5 cm, but still) and that I have the dpmm (which will probably be true, but again I can't assume it).

I'm not really a fan of the 20/50/80 dose points as that seems more sensitive to noise. What I've done to try and combine it all is to use an in-field ratio (what NCS calls the in-field area; I don't like the term "area" as it can be conflated with the symmetry term) and a "top" region ratio, and calculate the slopes on either side of the field but excluding the top region. E.g. for a 10 cm field with passed-in parameters of an in-field ratio of 0.8 and a top region of 0.4, the region from +/-4 cm (10 cm x 0.8 / 2) to +/-2 cm (10 cm x 0.4 / 2) respectively on each side will get a linear regression fit of the slope using all points. The slope is unlikely to be exactly a straight line, but I don't think it's any worse than "evaluation points" and it is less sensitive to noise. The region from -2 cm to +2 cm will be used to evaluate the "top" using the polynomial fit.
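To make the bookkeeping concrete, a small illustrative sketch of those regions on a synthetic profile; the profile itself and the variable names are made up, only the ratios and fits follow the description above.

    import numpy as np

    # Synthetic 10 cm "FFF-ish" profile sampled every 1 mm, centered on 0.
    x_mm = np.arange(-80, 81)
    field_width_mm, in_field_ratio, top_ratio = 100, 0.8, 0.4
    profile = np.where(np.abs(x_mm) <= field_width_mm / 2,
                       1.0 - 0.004 * np.abs(x_mm),   # gently sloped plateau
                       0.05)                          # out-of-field tail

    # Region boundaries: +/-40 mm (100 * 0.8 / 2) and +/-20 mm (100 * 0.4 / 2).
    in_field_edge = field_width_mm * in_field_ratio / 2
    top_edge = field_width_mm * top_ratio / 2

    # Linear regression of each shoulder (in-field edge to top edge), top excluded.
    left = (x_mm >= -in_field_edge) & (x_mm <= -top_edge)
    right = (x_mm >= top_edge) & (x_mm <= in_field_edge)
    left_slope = np.polyfit(x_mm[left], profile[left], 1)[0]
    right_slope = np.polyfit(x_mm[right], profile[right], 1)[0]

    # 2nd-order polynomial over the central "top" region; its vertex estimates the peak.
    top = np.abs(x_mm) <= top_edge
    c2, c1, _ = np.polyfit(x_mm[top], profile[top], 2)
    top_position_mm = -c1 / (2 * c2)
    print(left_slope, right_slope, top_position_mm)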

To answer your original question Alan, I think the three topics you brought up are dealt with above. To retain compatibility with other software, the constructor parameters (which will also be added to the analyze signature of the flatsym/fieldparam module) will allow the user to choose the best parameters for them. W/R/T protocols, using these parameters will mean that only one field center, penumbra, etc are calculated depending on which parameters are chosen. The protocols now just basically include the symmetry and flatness definitions, although I'm making it extensible so that custom calculations can be done on the profiles.

jrkerns commented 3 years ago

Little preview utilizing the Hill function and also the derivative inflection: [Attached images: flatsym, Figure_1]

jrkerns commented 3 years ago

@HeerNMeester or @kegge13 Is there a permalink to the NCS-33 report so I can link it from pylinac documentation?

kegge13 commented 3 years ago

A link to the NCS-33 prepublication is https://radiationdosimetry.org/files/Prepublication_-_NCS_Report_33_Beam_parameters_V2020-07-29.pdf. We are working on a much better text. The proposal to use the central 5 cm for the FFF Top model is a practical one: it represents the curved area pretty well. Indeed, this will not work for smaller fields, but those fields lack the FFF shape anyway. In BistroMath there are limits on the minimum field size and the minimum dose drop over the IFA for a profile to count as "FFF".

kegge13 commented 3 years ago

Little preview utilizing the Hill function and also the derivative inflection: [Attached images: flatsym, Figure_1]

Nice presentation of results. Indeed the parameters do offer a wide range of strategies. In the images the penumbra area looks quite narrow. How is it defined? I assume you use the real inflection point, as I discussed earlier with Alan. In these examples it seems that EPID data are used. When water phantom data or equivalent are used, the origin is mostly well defined by the user. I assume such an origin can also be used for normalisation and center definition? Recently I also introduced a "near origin" definition which collapses to the origin when it is sufficiently close and otherwise uses a border-based center.

kegge13 commented 3 years ago

@alanphys So far, I've changed SingleProfile to have far more constructor parameters and removed a lot of the interpolate=True parameters from the methods, as even before this PR I was unhappy with trying to handle it all. The signature is looking like so:

    def __init__(self, values: np.ndarray, dpmm: float = None,
                 interpolation: Interpolation = Interpolation.LINEAR,
                 ground: bool = True,
                 interpolation_resolution_mm: float=0.1,
                 interpolation_factor: float=10,
                 normalization_method: Normalization = Normalization.BEAM_CENTER,
                 edge_detection_method: Edge = Edge.FWHM,
                 edge_smoothing_ratio: float=0.003):

Interpolation can be none, linear, cubic, normalization can be none, max, beam, geometric (i.e. middle pixel), and edge detection can be FWHM, inflection via derivative (i.e. no fitting but can still handle FFF beams), and inflection via Hill (your PR basically). Between these combinations I think it will be able to handle several scenarios. The methods have been simplified (e.g. penumbra(lower, upper)) but obviously take into account the constructor arguments. Additionally, the methods will return dictionaries for the most part to cut down on the side and kind arguments and just include it all.

As for the "top", I like the 2nd order poly fit so far, but obviously only needs to be called for FFF beams. Additionally, NCS33 says to analyze it across the middle 5cm (is this always at iso? Seems depth, FS, and energy-dependent), but this already assumes a field of at least 5cm (which I cannot assume. Not sure why you'd analyze an FFF <5cm, but still) and that I have the dpmm (probably will be true but again can't assume).

I'm not really a fan of the 20/50/80 dose points as it seems more sensitive to noise. What I've done to try and combine it all is to use an in-field ratio (what NCS calls in-field area; I don't like the term "area" as it can be conflated with the symmetry term) and a "top" region ratio and calculate the slopes on either side of the field but excluding the top region. E.g. for a 10cm field with passed in parameters of an in-field ratio of 0.8 and top region of 0.4, the region from +/-4cm (10cm_0.8/2) to +/-2cm (10cm_0.4/2) respectively on each side will get a linear regression fit of the slope using all points. The slope is unlikely to be a straight line exactly but I don't think it's any worse than "evaluation points" and is less sensitive to noise. The region from -2cm to +2cm will be used to evaluate the "top" using the polynomial fit.

To answer your original question Alan, I think the three topics you brought up are dealt with above. To retain compatibility with other software, the constructor parameters (which will also be added to the analyze signature of the flatsym/fieldparam module) will allow the user to choose the best parameters for them. W/R/T protocols, using these parameters will mean that only one field center, penumbra, etc are calculated depending on which parameters are chosen. The protocols now just basically include the symmetry and flatness definitions, although I'm making it extensible so that custom calculations can be done on the profiles.

I'm a fan of the derivative function because that offers a base to find penumbra regions. And with more statistical analysis it will also find them for wedged profiles and FFF. This enables a reliable application of the Hill function without assuming anything.
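As a generic illustration of that idea (not pylinac code): the extrema of the first derivative mark the two penumbra centres, and a window around each can then be handed to a Hill fit without assuming anything about the field in advance.

    import numpy as np

    # Synthetic profile: low tails, ramped penumbrae, flat plateau.
    values = np.concatenate([np.full(40, 0.03), np.linspace(0.03, 1.0, 15),
                             np.full(80, 1.0), np.linspace(1.0, 0.03, 15),
                             np.full(40, 0.03)])

    grad = np.gradient(values)
    left_infl = int(np.argmax(grad))    # steepest rising point -> left penumbra centre
    right_infl = int(np.argmin(grad))   # steepest falling point -> right penumbra centre

    # A window around each inflection is the natural region to feed to the Hill fit.
    window = 10
    left_region = values[left_infl - window:left_infl + window + 1]
    right_region = values[right_infl - window:right_infl + window + 1]
    print(left_infl, right_infl, left_region.size, right_region.size)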

jrkerns commented 3 years ago

I'm a fan of the derivative function because that offers a base to find penumbra regions.

That's exactly what I ended up doing 😄.

Nice presentation of results. Indeed the parameters do offer a wide range of strategies. In the images the penumbra area looks quite narrow. How is it defined?

Exactly as NCS-33 says. Probably just the zoomed-out view that makes it narrow.

I assume you use the real inflection point as I discussed earlier with Alan.

I think for that figure it was the Hill function, but again, I will give the option to the user to use Hill (i.e. 4PL), 2nd derivative, or the traditional FWHM.

When water phantom data or equivalent are used mostly the origin is well defined by the user. I assume such an origin can also be used for normalisation and center definition?

I will have normalization options of None, geometric center (middle pixel), max, and beam center (for offset or wedge fields). I don't have water tank data so I'm unable to verify how the origin works there. Currently, I don't have an option for center definition; it's just the middle point between the field edges.

Here's a better image of an FFF beam using the Hill/4PL function, with the left panel zoomed in: [Attached image: FFF]

jrkerns commented 3 years ago

Closing this as it is duplicated by https://github.com/jrkerns/pylinac/pull/382, which includes the commits from here.