SasView / sasview

Code for the SasView application.
BSD 3-Clause "New" or "Revised" License
52 stars 41 forks source link

Discrepancies in Corfunc output #1382

Open smk78 opened 5 years ago

smk78 commented 5 years ago

Elliot Gilbert has been playing with Correlation Function analysis in SasView (in 4.x) in an attempt to understand what impact systematically varying aspects of the scattering data have on the derived correlation function and, more importantly, the extracted parameters that quantify the underlying lamellar morphology.

When @rprospero's corfunc-py code was assimilated into SasView I did test it against the old CCP13 code. I have done so again using the BSL \convertible_files\Z83000 & Z98000 and their regular \1d_data\ISIS_83404 & ISIS_98929 equivalents and, whilst the extrapolations and transforms compare pretty favourably, they are not identical. But then I would not expect them to actually be identical since the two implementations are not using identical code/libraries.

However, there is indeed some variation in the extracted parameters (in the following SV=SasView & CF=CCP13 CORFUNC). Testing the Z98000/ISIS_98929 data we have:

Background values of 0.3085 (SV) vs 0.3095 (CF) - not too bad at all. Hard block thickness is 17.1 (SV) vs 13.9 (CF) – close enough?

But then:

Long period is 31.6 (SV) vs 76 (CF) – I think here the SV extract algorithm has picked on a little hump in the base of the big minima in Gamma1(x), not present in the CF transform, rather than the big peak/hump around 70-80. In essence the code needs to look for a change in slope and in the SV case it's been misled. Interface thickness is 4.6 (SV) vs 2.0 (CF) Core thickness is 9.7 (SV) vs 4.0 (CF) *But I regard the fact that these are 2.000000000000 and 4.00000000000 as deeply suspicious!!! Local crystallinity is 0.54 (SV) vs 0.18 (CF) – but since this is calculated as hardblock thickness/long period it is bound to be different.

A lot of this will be down to the subtleties of the differences in the actual shapes of the 1D CF because of the way the way the lamellar morphology parameters are extracted. But we should satisfy ourselves that the code is doing what we expect.

sasview-v-corfunc CF1 CF3 IDF

smk78 commented 4 years ago

After further testing, it appears that both 4.2.2 and 5.0.2 mostly give ~same extracted parameters as the CCP13 CORFUNC (as noted above absolutely identical parameters are not expected). What has, however, been highlighted, is the much greater impact that the choice of high-Q extrapolation fitting range and background level can have on the shape of the 1D correlation function.

image

image

Parameter__________________________CCP13 CF__________4.2.2__________5.0.2
Average Long Period                76.00             74.07          74.1
Average Hard Block Thickness       13.95             12.56          12.6
Average Interface Thickness        2.00              0.00           0.0
Average Core Thickness             4.00              2.64           2.65
Local Hard Block Fraction          0.183             0.169          0.17
Polydispersity                     0.223             0.330          0.331

Whilst I still regard the Interface & Core Thicknesses from the CCP13 CF as suspicious, the Interface Thickness from SasView needs investigation, as does the fairly significant discrepancy in the Polydispersity.

smk78 commented 4 years ago

In CCP13 CF, the Polydispersity is computed as:

C       Search for the first minimum.
        DO i=2,numd-1
          IF ((gamma1(i) .LT. gamma1(i-1)) .AND.
     +    (gamma1(i) .LT. gamma1(i+1))) THEN
            IF (gamma1(i) .LT. 0.) THEN
              r3=xgamma(i)
              gammamin=-gamma1(i)
              minchannel=i
              lamflag=1
              GOTO 2020
            ENDIF
          ENDIF
        END DO

C       Search for local maxm.
2020    DO i=minchannel,(numd-1)
          IF ((gamma1(i) .GT. gamma1(i-1)) .AND.
     +     (gamma1(i) .GT. gamma1(i+1))) THEN
             IF (gamma1(i) .GT. 0.) THEN
               lp=xgamma(i)
               gammamax=gamma1(i)
               maxchannel=i
               lamflag=lamflag+1
               GOTO 2030
             ENDIF
           ENDIF
         END DO

C         reslam(nframe,8)=Polydispersity
C         (= gamma at first min / gamma at first max)
          reslam(nframe,8)=gammamin/gammamax

(Correct: that is not Python! :-) )

smk78 commented 4 years ago

In SasView CF, gammamin is computed in the same way:

        GammaMin = y[mins[0]]  # The value at the first minimum

but gammamax is computed by a more convoluted procedure (but possibly more rigorous):

https://github.com/SasView/sasview/blob/ESS_GUI/src/sas/sascalc/corfunc/corfunc_calculator.py Lines 175-237

butlerpd commented 4 years ago

❓ 🙉 that does not compute 😄 sensing there may be a corfunc paper in your future at this rate....

smk78 commented 4 years ago

Aside: The Stribeck reference in the Correlation Function help docs/tutorial notes (Eqn 8.68) that:

        GammaMin = - v / (1 - v)

where v is the hard block fraction (= hard block thickness / long period). So this could be used as an independent check on a value of gammamin determined by a different method.

smk78 commented 4 years ago

❓ 🙉 that does not compute 😄 sensing there may be a corfunc paper in your future at this rate....

There might be in Elliot's...

Finally found the time to capture the things we have learnt over the last couple of weeks!

smk78 commented 4 years ago

In respect of the 'ripples' in the IDF, user Yimin Mao notes:

I guess when the user subtracts too much 'flat baseline', the tail gets noisy--as it should be. Then the smoothing algorithm tries to produce a smooth curve using those subtracted data, and it contains a long-wavelength component (it really should be flat). This long wavelength component produces ripples after FT. Ruland has treated in detail the raw data corrections. The constant baseline was attributed to the density fluctuation associated with liquid-type compressibility. Its value was obtained by measuring the SAXS data in a relatively broad q-range, until seeing an increase of intensity at the high-q. The baseline was then extrapolated using ~q^-2 to q=0, and the intercept is the constant baseline. This is shown in Fig. 5a in the ref below. The I_FL term is actually slightly lower than the apparent flat baseline. Refs: W. Ruland. The evaluation of the small-angle scattering of lamellar two-phase systems by means of interface distribution functions. Colloid & Polymer Sci. 255, 417-427 (1977). Stribeck, N. & Ruland, W. Determination of the Interface Distribution Function of Lamellar Two-Phase Systems. J. Appl. Cryst. (1978) 11 535-539.

Stribeck also notes the limitations of the 'graphical' approach to extracting information from CFs and has proposed a more comprehensive - though mathematically indigestable - approach. Refs: Stribeck, N. X-Ray Scattering of Soft Matter, Springer. Berlin (2007), 138-161 & Stribeck, N. Analysis of scattering from polydisperse structure using Mellin convolution. J. Appl. Cryst. (2006) 39 237–243

lucas-wilkins commented 2 years ago

Plan of action:

1) Compare behaviour FORTRAN code above against current calculation of polydispersity. Check it does the same thing. 2) Look closely at extract_parameters in sascalc.corfunc.corfunc_calculator.CorfuncCalculator c.f. https://github.com/scattering-central/CCP13/blob/master/corfunc/fortran/corfunc/extract_par.f

I'll see what I find!

lucas-wilkins commented 2 years ago

Need to find out how to run CC13, the windows function has gone.

lucas-wilkins commented 1 year ago

@smk78 Might be worth looking at this now there has been some changes.