galacticusorg / galacticus

The Galacticus galaxy formation model
GNU General Public License v3.0
24 stars 16 forks source link

`integral kernel is non-positive` in excursion set solver #192

Closed abensonca closed 3 years ago

abensonca commented 3 years ago

Running the cole2000WellOrdering test suite code results in an error:

Fatal error:
integral kernel is non-positive - possibly the effective barrier is too large
 Occurred at:
   subroutine:farahiMidpointRateTabulate
         file:structure_formation.excursion_sets.first_crossing_distribution.Farahi.midpoint.F90   [line 649]
  => Error occurred in thread            6

when tabulating first crossing rates.

Using git bisect shows that the commit that introduced this regression was b5b335e632ccb07ebd3be7a1d56a497a7766d7d5.

@Xiaolong-Du including you here in case you have any insights into why this may have caused a problem in the excursion set solver? Also, could you check if this problem still occurs when using your new implementation of the excursion set solver using the built-in erf function? Note that to test it you need to remove $GALACTICUS_DATA_PATH/dynamic/largeScaleStructure/excursionSets/excursionSetsWdmMx1.5Gx1.5.hdf5 otherwise it may use a pre-existing tabulation and the bug won't be triggered. I'll be looking into this problem too.

abensonca commented 3 years ago

Possibly the problem is that in source/structure_formation.excursion_sets.first_crossing_distribution.Farahi.midpoint.F90 this section of code:

                do i=1,self%varianceTableCountRate
                   massProgenitor            =+cosmologicalMassVariance_%mass        (real(+varianceTableRateQuad   (i)+varianceTableRateBaseQuad(iVariance),kind=8),self%timeTableRate(iTime)                        )
                   growthFactorEffective     =+cosmologicalMassVariance_%rootVariance(      massProgenitor                                                          ,self%timeTableRate(iTime)                        ) &
                        &                     /cosmologicalMassVariance_%rootVariance(      massProgenitor                                                          ,     timeProgenitor                              )
                   barrierTableRateQuad   (i)=+excursionSetBarrier_     %barrier     (real(+varianceTableRateQuad   (i)+varianceTableRateBaseQuad(iVariance),kind=8),     timeProgenitor      ,node,rateCompute=.true.) &
                        &                     *growthFactorEffective
                   massProgenitor            =+cosmologicalMassVariance_%mass        (real(+varianceMidTableRateQuad(i)+varianceTableRateBaseQuad(iVariance),kind=8),self%timeTableRate(iTime)                        )
                   growthFactorEffective     =+cosmologicalMassVariance_%rootVariance(      massProgenitor                                                          ,self%timeTableRate(iTime)                        ) &
                        &                     /cosmologicalMassVariance_%rootVariance(      massProgenitor                                                          ,     timeProgenitor                              )
                   barrierMidTableRateQuad(i)=+excursionSetBarrier_     %barrier     (real(+varianceMidTableRateQuad(i)+varianceTableRateBaseQuad(iVariance),kind=8),     timeProgenitor      ,node,rateCompute=.true.) &
                        &                     *growthFactorEffective
                end do

the variance is passed to cosmologicalMassVariance_%mass() instead of the root-variance. Testing this now.

abensonca commented 3 years ago

Unfortunately that does not solve the problem.....

Xiaolong-Du commented 3 years ago

@abensonca I saw this problem before. Usually, I just set the first crossing rate to zero if this happens. But it may need more detailed investigation. I guess what happens here is that the size of the first step is too small, or the time step size for calculating the rate is too large. The erfApproximate function returns 1 when the argument is larger than 8.75, which gives a zero integralKernelRate. Replacing erfApproximate with the build-in error function may help in certain cases, but this usually leads to inaccurate results because integralKernelRate is extremely small. I will run some tests on this issue and let you know if I find a solution.

abensonca commented 3 years ago

At least in this case increasing varianceNumberPerUnit from 40 to 80 allows it to run to completion. This makes it use a smaller step in variance which makes the difference in barrier heights in the problem step smaller.

So, a possible "solution" for now is just to have it issue a help message if this error occurs suggesting to increase varianceNumberPerUnit. It could also attempt to make that increase automatically, but it's not obvious to me if that would always work.

abensonca commented 3 years ago

I've tried a few more cases and the only thing that seems to work reliably is to set the first crossing rate to zero in these cases. Since these calculations should be improved by @Xiaolong-Du 's refactoring to use erfc() then I think a temporary fix which does this "set to zero" is good enough for now. @Xiaolong-Du - do you agree?

abensonca commented 3 years ago

Fixed by e64b4b1df82c5871da9e1beafc7c571883ccd868 - closing.