In the end I decided this is not needed because the gradient "self-normalizes"; that is, the gradient is unchanged when all the entries of the likelihood matrix are multiplied by a constant factor (100 in this example):
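library(mixsqp)  # assumed source of simulatemixdata
set.seed(1)
n <- 1e5
m <- 10
w <- rep(1/n,n)
L <- simulatemixdata(n,m,normalize.rows = FALSE)$L
# Uncomment to rescale L; the gradient g below should not change.
# L <- 100*L
x <- rep(1/m,m)
u <- w / c(L %*% x)
# Gradient at x.
g <- -c(t(L) %*% u)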
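The invariance can also be checked directly in one run instead of toggling the commented-out line. The following is only a sketch: it swaps in a random positive matrix for the simulated likelihood matrix so that it runs without any package, and the helper `grad` is a name made up for this illustration. The scale factor cancels because `u` is divided by it while `t(L)` is multiplied by it.

```r
set.seed(1)
n <- 1000
m <- 10
w <- rep(1/n, n)

# Stand-in for the likelihood matrix (an assumption for this sketch):
# any n x m matrix with strictly positive entries will do.
L <- matrix(rexp(n * m), n, m)
x <- rep(1/m, m)

# Hypothetical helper wrapping the same gradient computation as above.
grad <- function(L, x, w) {
  u <- w / c(L %*% x)
  -c(t(L) %*% u)
}

g1 <- grad(L, x, w)        # original matrix
g2 <- grad(100 * L, x, w)  # every entry scaled by 100

# The two gradients should agree up to floating-point rounding.
print(max(abs(g1 - g2)))
print(all.equal(g1, g2))
```

Any positive constant should behave the same way as 100 here, since it appears once in the numerator and once in the denominator.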