Rather than using global parameters for the constants, I suggest hardcoding them in the expression. That is, add softcore_alpha=0.5; softcore_a=1; softcore_b=1; softcore_c=6 to your expression. That will let it optimize much more effectively. For example, it will do the optimization ((1)-(lambda))^(-1+(softcore_b)) = (1-lambda)^(-1+1) = (1-lambda)^0 = 1. With that change, the nans go away.
From @peastman:
cc https://github.com/openmm/openmm/issues/2914#issuecomment-724182279