Open aakash10gupta opened 3 days ago
Thanks for pointing these out! @aakash10gupta and I went through these and determined that most are large NumPy vector operations that take a long time. The solution will be to find ways to optimize these vector manipulations. I'm leaving the relevant code blocks below so I remember where to look when dealing with this.
scaling gradient to absolute model perturbations

```python
logger.info("scaling gradient to absolute model perturbations")
gradient.update(vector=gradient.vector * model.vector)
gradient.write(path=os.path.join(self.path.eval_grad, "gradient"))
```
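One avenue for the gradient-scaling step: the expression `gradient.vector * model.vector` allocates a whole new array the size of the model before `update()` stores it. A minimal sketch of the in-place alternative, assuming both vectors are plain 1-D NumPy arrays (`grad` and `model` below are hypothetical stand-ins, not the actual SeisFlows objects):

```python
import numpy as np

# Hypothetical stand-ins for gradient.vector and model.vector, assumed to be
# large 1-D NumPy float arrays in practice.
grad = np.array([1.0, 2.0, 3.0])
model = np.array([10.0, 20.0, 30.0])

expected = grad * model  # out-of-place: allocates a temporary array

# In-place elementwise multiply: writes the product back into grad's own
# buffer, avoiding the temporary allocation for very large vectors.
np.multiply(grad, model, out=grad)
```

Whether this helps depends on whether `update()` can accept the same buffer back; it mainly saves memory traffic rather than FLOPs.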
exporting gradient to disk

```python
logger.info("exporting gradient to disk")
src = os.path.join(self.path.eval_grad, "gradient")
dst = os.path.join(self.solver.path.output, "gradient")
unix.cp(src, dst)
```
calculating search direction 'p_new' from gradient

```python
g_new = self.load_vector("g_new")
p_new = g_new.copy()
p_new.update(vector=-1 * self._precondition(g_new.vector))
```
INITIALIZE LINE SEARCH

```python
gtg = dot(g.vector, g.vector)
gtp = dot(g.vector, p.vector)
m = self.load_vector("m_new")  # current model
p = self.load_vector("p_new")  # current search direction
norm_m = max(abs(m.vector))
norm_p = max(abs(p.vector))
alpha = self.step_len_init * norm_m / norm_p
```
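One likely hotspot here: `max(abs(m.vector))` applies the vectorized `abs`, but then the Python built-in `max()` iterates the resulting array element by element. A minimal sketch of the vectorized reduction, assuming `m.vector` is a NumPy array (`m` below is a small stand-in):

```python
import numpy as np

# Hypothetical stand-in for m.vector.
m = np.array([0.5, -3.0, 2.0])

# Built-in max() loops over the ndarray in Python, one element at a time.
norm_m_slow = max(abs(m))

# Vectorized reduction: same value, computed in C.
norm_m_fast = np.abs(m).max()

# Equivalent one-liner: np.linalg.norm(m, np.inf)
```

For model-sized vectors the difference between the two can be orders of magnitude in wall time.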
```python
# The new model is the old model plus a step with a given magnitude
m_try = self.load_vector("m_new").copy()
p = self.load_vector("p_new")  # current search direction
dm = alpha * p.vector  # update = step length * step direction
logger.info(f"updating model with `dm` (dm_min={dm.min():.2E}, "
            f"dm_max={dm.max():.2E})")
m_try.update(vector=m_try.vector + dm)
```
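The model update `m_try.vector + dm` builds a third full-size array. A sketch of the in-place version, assuming `m_try.vector` is a NumPy array that `update()` can take back (`m_try_vec` and `dm` below are hypothetical stand-ins for the real vectors):

```python
import numpy as np

# Hypothetical stand-ins for m_try.vector and dm.
m_try_vec = np.array([1.0, 2.0, 3.0])
dm = np.array([0.1, 0.2, 0.3])

# In-place add: reuses m_try_vec's buffer instead of allocating a new
# array for the sum, which matters when the model vector is very large.
m_try_vec += dm
```

Similarly, `dm.min()` and `dm.max()` in the log message each make a separate pass over `dm`; harmless for correctness, but two more full scans of a large array.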
try: misfit increasing, attempting to reduce step length using parabolic backtrack

```python
slope = self.gtp[-1] / self.gtg[-1]
```
try: first step of iteration, setting scaled step length

```python
alpha = self.step_lens[idx] * self.gtp[-2] / self.gtp[-1]
```
writing optimization stats: '/....../output_optim.txt'

```python
grad_norm_L1 = np.linalg.norm(g.vector, 1)  # L1 norm of gradient
grad_norm_L2 = np.linalg.norm(g.vector, 2)  # L2 norm of gradient
```
Below I'm listing a few generic log messages, grouped under their respective state headers, extracted from an sflog.txt file. After each of these messages, the next log message only appears after a considerable delay, probably because of the long-running processes executed in between. These processes could potentially be optimized to reduce their running time:
```
EVALUATE GRADIENT FROM KERNELS
scaling gradient to absolute model perturbations
exporting gradient to disk
calculating search direction 'p_new' from gradient
INITIALIZE LINE SEARCH
step length 'alpha' = ......
UPDATE LINE SEARCH
try: misfit increasing, attempting to reduce step length using parabolic backtrack
try: first step of iteration, setting scaled step length
writing optimization stats: '/....../output_optim.txt'
```