Rosemeis / pcangsd

Framework for analyzing low depth NGS data in heterogeneous populations using PCA.
GNU General Public License v3.0
47 stars 11 forks source link

Feature Request: Improved Convergence Checking #74

Open ajasonowicz opened 1 year ago

ajasonowicz commented 1 year ago

Hi,

I recently ran into an issue where the maximum number of iterations during MAF estimation was reached but the tolerance criteria was not, since the tolerance check is nested in a for loop here: https://github.com/Rosemeis/pcangsd/blob/8832548da0a353d06a58dc361a28dbea7decc426/pcangsd/shared.py#L21-L26 The program proceeded with the covariance matrix estimation without reaching the tolerance criteria for MAF estimation. You might consider revising the code to ensure that the tolerance criteria are met before the program proceeds and that the results of the convergence checks are relayed to the user (especially if they fail). I re-wrote portions of the code relevant to the features of pcangsd I am using with a while loop and a exit if the check fails. Here is an example of what I came up with for the checks in emMAF in shared.py:

  i = 0
  diff = 1
  while (i <= iter) & (tole < diff):
      shared_cy.emMAF_update(L, f, t)
      diff = shared_cy.rmse1d(f, f_prev)

      i +=1
      f_prev = np.copy(f)
      print(f"iter:{i:,d}, diff:{diff:0.8f}")

  if diff < tole:
      print(f"EM (MAF) converged at iteration:  {i} iterations.")

  else:
      print(
          f"!!!WARNING!!! EM (MAF) did not converge. Consider increasing --maf_iter (current value {iter})\n...exiting."
      )
      sys.exit()

I noticed a similar structure for the convergence checks in covariance.py and modified that as well. I hope you find this suggestion useful for future development efforts of this great program!

Cheers

Rosemeis commented 1 year ago

Hi,

Thanks for the good suggestions. I have now added print statements in case the different features would not converge (v1.2). All EM algorithms now have acceleration schemes such that convergence should not be an issue anymore. :-)

Best, Jonas

ajasonowicz commented 1 year ago

Awesome, thanks a lot Jonas! I can test out the new version with a set of parameters that has caused me some issues in the past.

Cheers, Andy