wfau / ScienceArchives

Why does cu34 ingest take so long and seem to slow down? #556

Open wfastrononomer opened 3 weeks ago

wfastrononomer commented 3 weeks ago

Each FITS file has a few tens of thousands to hundreds of thousands of sources, but only a few tens get ingested per day. Why is this so slow?

Log files, e.g.:
/disk78/genOps/vsa/logs/TestVSAnjcUVDR6/cu0id3201.log
/disk78/genOps/vsa/logs/TestVSAnjcUVDR6/cu34id6191.log

esutorius commented 20 hours ago

A standard CU4 ingest into the VHS takes ~20 sec for ~500,000 detections across all 3 tables. For the VVV we ingest 20 million rows into Raw in 7 mins, and into Photometry and Astrometry in ~1 min each. These ingests are split into 2 million row chunks, since we hit a wall at 3-4 million rows. The Raw and Astrometry tables have roughly the same width in both cases (76/75 and 14/17 columns for VVV/TestUV, resp.), but the Photometry table is much wider in TestUV (50/125, resp.). So my suspicions are: wrong indices and/or unsorted data; the Photometry table being too wide (?); too little memory on the curation server (to check, run ingests of the same data on huni); or the database server being overloaded with too many testDBs (needs a clean-up).
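To put the numbers above side by side, here is a minimal sketch of the implied throughput and of the 2 million row chunking. The helper names `chunk_ranges` and `rows_per_second` are hypothetical and written only for this illustration; they are not taken from the VSA curation code, and the figures are the approximate ones quoted above.

```python
from typing import Iterator, Tuple


def chunk_ranges(total_rows: int, chunk_size: int = 2_000_000) -> Iterator[Tuple[int, int]]:
    """Yield half-open (start, stop) row ranges that cover total_rows.

    Hypothetical helper: the 2 million default mirrors the chunk size
    quoted in the comment, chosen to stay below the 3-4 million row wall.
    """
    for start in range(0, total_rows, chunk_size):
        yield start, min(start + chunk_size, total_rows)


def rows_per_second(rows: int, seconds: float) -> float:
    """Average ingest rate for a given row count and elapsed time."""
    return rows / seconds


# VVV Raw ingest: 20 million rows split into 2 million row chunks.
chunks = list(chunk_ranges(20_000_000))

# Approximate reference rates from the figures quoted above.
vhs_rate = rows_per_second(500_000, 20)              # VHS: ~500,000 detections in ~20 sec
vvv_raw_rate = rows_per_second(20_000_000, 7 * 60)   # VVV Raw: 20 million rows in 7 mins

print(f"{len(chunks)} chunks, first={chunks[0]}, last={chunks[-1]}")
print(f"VHS: {vhs_rate:.0f} rows/s, VVV Raw: {vvv_raw_rate:.0f} rows/s")
```

Both reference ingests run at tens of thousands of rows per second, so a rate of a few tens of files per day for TestUV is far below what row count alone would explain, which is consistent with looking at indices, sort order, table width, memory and server load.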