ORNL-QCI / ExaTENSOR

Basic numerical tensor algebra library for distributed heterogeneous HPC platforms
BSD 3-Clause "New" or "Revised" License

Wrong values and NaN if intermediates are created and destroyed repeatedly #5

Open jpoto opened 1 year ago

jpoto commented 1 year ago

In my code I used the intermediates I3 and I4:

      ierr=exatns_tensor_create(I3,"I3",id%vvoo,root%vvoo,EXA_DATA_KIND_C8)
      ierr=exatns_tensor_create(I4,"I4",id%vvoo,root%vvoo,EXA_DATA_KIND_C8)
      ierr=exatns_tensor_init(I3,ZERO)
      ierr=exatns_tensor_init(I4,ZERO)
      ierr=exatns_tensor_contract("T(e,a,k,n)+=X+(e,c,j,k)*Y(c,a,j,n)",I3,t2_v,t2_o)
      if (ierr.ne.0) call quit('laplace_get_T: contraction T12 wrong')
      ierr=exatns_tensor_contract("G(e,a,k,n)+=X+(a,b,i,e)*Y(b,n,k,i)",I4,vvov_v,vooo_o)
      if (ierr.ne.0) call quit('laplace_get_T: contraction G11 wrong')
      ierr=exatns_tensor_contract("R()+=T(e,a,k,n)*G(e,a,k,n)",T_tensor,I3,I4,ONE_QUARTER)
      if (ierr.ne.0) call quit('laplace_get_T: contraction 23 wrong')
      ierr=exatns_tensor_destroy(I3)
      ierr=exatns_tensor_destroy(I4)

and reused the same intermediates, creating and destroying them each time, in similar pieces of code.

At runtime I got NaN values. I have now grouped these pieces of code so that the intermediates are created and destroyed only once, and the NaNs vanished. I still get a larger-than-expected error, but regrouping should solve it.

Therefore: Avoid creating and destroying tensors too often.

jpoto commented 1 year ago

I commented out the contraction that gave NaN, but then the next contraction gave NaN instead.

jpoto commented 1 year ago

I now circumvent this error by creating all intermediates at the beginning of the routine and destroying them at the end.
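
A minimal sketch of that workaround, for illustration only. It reuses just the calls and arguments that appear in the snippet in the first comment and is a fragment of the routine, not a complete program. It assumes that all variables are declared as in the original routine, that every reuse of I3 and I4 has the same vvoo shape, and that calling exatns_tensor_init again is an acceptable way to reset an intermediate before reuse. The added error messages are hypothetical.

```fortran
      ! Fragment only: I3, I4, id, root, t2_v, t2_o, vvov_v, vooo_o, T_tensor,
      ! ZERO, ONE_QUARTER, ierr and quit are assumed to exist as in the
      ! original routine. Error message strings below are hypothetical.

      ! Create each intermediate once, at the beginning of the routine.
      ierr=exatns_tensor_create(I3,"I3",id%vvoo,root%vvoo,EXA_DATA_KIND_C8)
      if (ierr.ne.0) call quit('laplace_get_T: create I3 failed')
      ierr=exatns_tensor_create(I4,"I4",id%vvoo,root%vvoo,EXA_DATA_KIND_C8)
      if (ierr.ne.0) call quit('laplace_get_T: create I4 failed')

      ! First use. The contractions accumulate (+=), so the intermediates
      ! are zeroed first (assumption: re-init is valid before every reuse).
      ierr=exatns_tensor_init(I3,ZERO)
      if (ierr.ne.0) call quit('laplace_get_T: init I3 failed')
      ierr=exatns_tensor_init(I4,ZERO)
      if (ierr.ne.0) call quit('laplace_get_T: init I4 failed')
      ierr=exatns_tensor_contract("T(e,a,k,n)+=X+(e,c,j,k)*Y(c,a,j,n)",I3,t2_v,t2_o)
      if (ierr.ne.0) call quit('laplace_get_T: contraction T12 wrong')
      ierr=exatns_tensor_contract("G(e,a,k,n)+=X+(a,b,i,e)*Y(b,n,k,i)",I4,vvov_v,vooo_o)
      if (ierr.ne.0) call quit('laplace_get_T: contraction G11 wrong')
      ierr=exatns_tensor_contract("R()+=T(e,a,k,n)*G(e,a,k,n)",T_tensor,I3,I4,ONE_QUARTER)
      if (ierr.ne.0) call quit('laplace_get_T: contraction 23 wrong')

      ! The other, similar pieces of code go here: zero I3 and I4 again,
      ! then contract, without any create or destroy in between.

      ! Destroy each intermediate once, at the end of the routine.
      ierr=exatns_tensor_destroy(I3)
      if (ierr.ne.0) call quit('laplace_get_T: destroy I3 failed')
      ierr=exatns_tensor_destroy(I4)
      if (ierr.ne.0) call quit('laplace_get_T: destroy I4 failed')
```

Checking ierr after every create, init and destroy call, which the original snippet does only for the contractions, may also help to narrow down where the wrong values first appear.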