Open jhpoelen opened 7 years ago
And some more from spark job monitor -
Turns out that /mnt/data was running out of space (97% of 1.8T used). I've cleaned out some Kafka logs and made a tiny change to the idigbio-spark job to reduce memory/disk pressure, so that big jobs (like updating monitors) do not stall. @jhammock @mjcollin @godfoder I suggest moving forward on #4 to avoid duplicate maintenance efforts.
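For anyone who wants to catch this earlier next time, here is a minimal sketch of a disk-usage check using Python's standard library. The function names and the 90% threshold are my own assumptions for illustration, not something that exists in the current jobs.

```python
import shutil


def used_fraction(total_bytes, used_bytes):
    """Return the fraction of a volume that is in use (0.0 to 1.0)."""
    return used_bytes / total_bytes


def is_nearly_full(path, threshold=0.90):
    """Return True if the volume holding `path` is at or above `threshold`.

    The 0.90 default is an assumed alert level, not a project setting.
    """
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return used_fraction(usage.total, usage.used) >= threshold


if __name__ == "__main__":
    # Replace "." with the data volume to watch, e.g. /mnt/data.
    print(is_nearly_full("."))
```

Running something like this from cron before a big job starts would flag a volume at 97% well before the job stalls.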
With 2 copies of GBIF (~600M * 2), 4 copies of iDigBio (~60M * 4), and 6 copies of (~2M * 6), updating the monitors takes over a week. Adding more capacity (see #4) would definitely help with this issue. Time could also be spent on optimizing the algorithms used to calculate differences across ~1.5G occurrence records.
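As a quick sanity check on the ~1.5G figure, the arithmetic can be spelled out as below. This is only a sketch: it reads each pair of numbers as approximate records per copy times number of copies, and the label for the third source is a placeholder because its name is missing from the comment.

```python
# (approximate records per copy, number of copies), as quoted in the comment
datasets = {
    "GBIF": (600_000_000, 2),
    "iDigBio": (60_000_000, 4),
    "unnamed third source": (2_000_000, 6),  # placeholder: name not given
}


def total_records(datasets):
    """Sum records across all copies of all datasets."""
    return sum(size * copies for size, copies in datasets.values())


total = total_records(datasets)
print(f"{total:,} records (~{total / 1e9:.2f}G)")
# prints: 1,452,000,000 records (~1.45G)
```

GBIF alone accounts for 1.2G of the total, so the number of GBIF copies dominates how long a monitor update takes.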
from http://archive.guoda.bio -