wfau / ScienceArchives


Ingest: why are only 654 out of 3800 files ingested? How are ingest files set up? Is this a fieldID issue? Update createIngestFiles, and also include more in each file list for improved efficiency! #613

Open wfastrononomer opened 2 months ago

wfastrononomer commented 2 months ago

Something is still very wrong with the ingests!

tms-epcc commented 1 month ago

11/SEP/24

tms-epcc commented 3 weeks ago

16/OCT/24

wfastrononomer commented 2 weeks ago

testVSAnjcUVDR6> select bitProcessingFlag,isIngested,COUNT(*) from MapFrameStatus where programmeID=160 and mapID=2 and multiframeID>0 group by bitProcessingFlag,isIngested

| bitProcessingFlag | isIngested | COUNT(*) |
| 1073741824 | 0 | 7 |
| 1073741824 | 1 | 3835 |

Rows returned: 2 (execution time: 0.121155 s)

testVSAnjcUVDR6> select COUNT_BIG(*) from ultravistaMapRemeasurementRaw

| 817504568 |

Rows returned: 1 (execution time: 352.789 s)

testVSAnjcUVDR6> select COUNT_BIG(*) from ultravistaMapRemeasurementAstrometry

| 830701165 |

Rows returned: 1 (execution time: 149.451 s)

testVSAnjcUVDR6> select COUNT_BIG(*) from ultravistaMapRemeasurementPhotometry

| 816713497 |

wfastrononomer commented 2 weeks ago

Now 99.8% of catalogues are flagged as ingested. However, the three tables are not equally complete: Raw, Astrometry and Photometry contain different sets of catalogues.

rawCats = db.query("distinct catalogueID","ultravistaMapRemeasurementRaw")
len(rawCats)
3774
astCats = db.query("distinct catalogueID","ultravistaMapRemeasurementAstrometry")
phtCats = db.query("distinct catalogueID","ultravistaMapRemeasurementPhotometry")
len(astCats)
3836
len(phtCats)
3771
allCats = db.query("catalogueID","MapFrameStatus","programmeID=160 and mapID=2 and multiframeID>0")
len(allCats)
3842
allIng = set(rawCats).intersection(set(astCats)).intersection(set(phtCats))
len(allIng)
3758

97.8% completed.

notFullIngest = set(allCats).difference(allIng)
notFullIngest = sorted(list(notFullIngest))
notFullIngest
[56L, 117L, 150L, 181L, 185L, 240L, 265L, 267L, 284L, 335L, 439L, 504L, 531L, 703L, 736L, 819L, 874L, 947L, 1033L, 1057L, 1080L, 1220L, 1255L, 1372L, 1386L, 1461L, 1486L, 1714L, 1720L, 1763L, 1815L, 1956L, 2019L, 2026L, 2089L, 2515L, 2547L, 2605L, 2703L, 2710L, 2716L, 2720L, 2752L, 2766L, 2842L, 2861L, 2871L, 2907L, 2973L, 2997L, 3006L, 3016L, 3025L, 3068L, 3118L, 3140L, 3167L, 3246L, 3352L, 3405L, 3419L, 3436L, 3499L, 3560L, 3644L, 3645L, 3734L, 3816L, 4263L, 4360L, 4390L, 4420L, 4492L, 4830L, 4859L, 5259L, 5444L, 6015L, 6063L, 6100L, 6264L, 6304L, 6334L, 6559L, 6942L]
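The set difference above lists the catalogues missing from at least one table, but not which table each is missing from. A minimal sketch of that breakdown follows, using small made-up ID lists in place of the db.query results. The function name and the toy data are hypothetical and not part of the archive code.

```python
def missing_by_table(all_cats, tables):
    """Map each incompletely ingested catalogueID to the tables lacking it.

    all_cats: iterable of expected catalogueIDs.
    tables:   dict of table label -> iterable of catalogueIDs present in it.
    """
    present = {name: set(ids) for name, ids in tables.items()}
    report = {}
    for cat in sorted(set(all_cats)):
        absent = sorted(name for name, ids in present.items() if cat not in ids)
        if absent:
            report[cat] = absent
    return report


# Toy data standing in for the real query results (assumption).
all_cats = [1, 2, 3, 4, 5]
tables = {
    "Raw": [1, 2, 4],
    "Ast": [1, 2, 3, 4, 5],
    "Pht": [1, 2, 4, 5],
}
report = missing_by_table(all_cats, tables)
for cat, absent in report.items():
    print(cat, absent)
```

With the real data, rawCats, astCats and phtCats would go in the tables dict and allCats in all_cats.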

wfastrononomer commented 2 weeks ago

misMatches
[(131, {'Raw': 214769, 'Pht': 216540, 'Ast': 216540}),
 (181, {'Raw': 212969, 'Pht': 218688, 'Ast': 218688}),
 (1074, {'Raw': 219837, 'Pht': 214978, 'Ast': 219837}),
 (1258, {'Raw': 215417, 'Pht': 215417, 'Ast': 220385}),
 (1302, {'Raw': 221680, 'Pht': 217909, 'Ast': 217909}),
 (1372, {'Raw': 217073, 'Pht': 218069, 'Ast': 218069}),
 (1944, {'Raw': 217801, 'Pht': 217801, 'Ast': 210608}),
 (2065, {'Raw': 218379, 'Pht': 212154, 'Ast': 212154}),
 (2181, {'Raw': 219295, 'Pht': 220055, 'Ast': 219295}),
 (2384, {'Raw': 222792, 'Pht': 212260, 'Ast': 212260}),
 (2698, {'Raw': 212747, 'Pht': 212747, 'Ast': 215487}),
 (2980, {'Raw': 221060, 'Pht': 221060, 'Ast': 220744}),
 (3006, {'Raw': 217900, 'Pht': 218158, 'Ast': 218158}),
 (3475, {'Raw': 215788, 'Pht': 219211, 'Ast': 219211}),
 (3550, {'Raw': 212919, 'Pht': 212919, 'Ast': 218395})]

[131, 181, 1074, 1258, 1302, 1372, 1944, 2065, 2181, 2384, 2698, 2980, 3006, 3475, 3550]
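A mismatch list like the one above can be built by comparing per-catalogue row counts from the three tables. The thread does not show how misMatches was actually produced, so this is only a sketch. It assumes the counts have already been fetched into dictionaries keyed by catalogueID, and the function and variable names are illustrative.

```python
def find_mismatches(raw, ast, pht):
    """Return [(catalogueID, {'Raw': n, 'Ast': n, 'Pht': n}), ...] for
    catalogues whose row counts differ between the three tables."""
    mismatches = []
    for cat in sorted(set(raw) | set(ast) | set(pht)):
        counts = {
            "Raw": raw.get(cat, 0),
            "Ast": ast.get(cat, 0),
            "Pht": pht.get(cat, 0),
        }
        # More than one distinct value means the tables disagree.
        if len(set(counts.values())) > 1:
            mismatches.append((cat, counts))
    return mismatches


# Two catalogues taken from the output above, plus one made-up
# consistent catalogue (ID 9999 is hypothetical).
raw = {131: 214769, 1074: 219837, 9999: 1000}
ast = {131: 216540, 1074: 219837, 9999: 1000}
pht = {131: 216540, 1074: 214978, 9999: 1000}

mismatches = find_mismatches(raw, ast, pht)
print([cat for cat, _ in mismatches])  # [131, 1074]
```

A catalogue absent from one table is counted as 0 rows there, so it is reported as a mismatch too.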

Next: check the FITS files.

wfastrononomer commented 2 weeks ago

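catalogueID = 131
fileName = '/disk62/sys/test/products/map/20240719_v6/v20150514_00079_st_map160_2_1.fits'
import numpy as np
# hdulist is the opened FITS file for fileName (the open call is not shown in this session)
# Sum the table rows over every extension, skipping the primary HDU (index 0)
np.sum([hdu.data.size for ii, hdu in enumerate(hdulist) if ii>0])
214769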

Raw looks correct, so why are there 1771 more rows in Astrometry and Photometry?

select apertureID from ultravistaMapRemeasurementAstrometry where mapID=2 and catalogueID=131 and apertureID not in (select apertureID from ultravistaMapRemeasurementRaw where mapID=2 and catalogueID=131)

Rows returned: 165526 (execution time: 0.187392 s)

165526 unmatched apertureIDs is far more than the 1771-row difference in counts, so it looks like values have come from somewhere else.
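To spell out the reasoning: if Astrometry held the same apertures as Raw plus a few extras, the number of unmatched apertureIDs would equal the difference in row counts. An unmatched count far above that difference means most of the Astrometry rows do not correspond to Raw at all. The sketch below reproduces the same NOT IN check on toy data. It uses an in-memory SQLite database as a stand-in for the archive database, with shortened table names, both of which are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Raw (mapID INTEGER, catalogueID INTEGER, apertureID INTEGER);
    CREATE TABLE Astrometry (mapID INTEGER, catalogueID INTEGER, apertureID INTEGER);
""")

# Raw holds apertures 1-5. Astrometry holds 6 rows, but only 1 and 2 match Raw.
conn.executemany("INSERT INTO Raw VALUES (2, 131, ?)",
                 [(i,) for i in range(1, 6)])
conn.executemany("INSERT INTO Astrometry VALUES (2, 131, ?)",
                 [(i,) for i in (1, 2, 101, 102, 103, 104)])

n_ast = conn.execute(
    "SELECT COUNT(*) FROM Astrometry WHERE mapID=2 AND catalogueID=131"
).fetchone()[0]
n_raw = conn.execute(
    "SELECT COUNT(*) FROM Raw WHERE mapID=2 AND catalogueID=131"
).fetchone()[0]
count_diff = n_ast - n_raw

# Same shape as the query above: Astrometry apertures with no Raw counterpart.
unmatched = conn.execute("""
    SELECT COUNT(*) FROM Astrometry
    WHERE mapID=2 AND catalogueID=131
      AND apertureID NOT IN (
          SELECT apertureID FROM Raw WHERE mapID=2 AND catalogueID=131)
""").fetchone()[0]

print(count_diff, unmatched)  # 1 4
```

Here the counts differ by only 1 row while 4 apertures are unmatched, the same pattern as 1771 versus 165526 for catalogue 131.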

Best to delete the ingested rows for these catalogues and rerun the ingest.

wfastrononomer commented 1 week ago

select COUNT(distinct catalogueID) from ultravistaMapRemeasurementRaw where mapID=2 and catalogueID>0

| 3841 |

Rows returned: 1 (execution time: 309.854 s)

testVSAnjcUVDR6> select COUNT(distinct catalogueID) from ultravistaMapRemeasurementAstrometry where mapID=2 and catalogueID>0

| 3842 |

Rows returned: 1 (execution time: 126.872 s)

testVSAnjcUVDR6> select COUNT(distinct catalogueID) from ultravistaMapRemeasurementPhotometry where mapID=2 and catalogueID>0

| 3842 |

Rows returned: 1 (execution time: 500.217 s)

testVSAnjcUVDR6> select COUNT(*) from ultravistaMapRemeasurementPhotometry where mapID=2 and catalogueID>0

| 832233482 |

Rows returned: 1 (execution time: 499.79 s)

testVSAnjcUVDR6> select COUNT(*) from ultravistaMapRemeasurementAstrometry where mapID=2 and catalogueID>0

| 832233482 |

Rows returned: 1 (execution time: 126.503 s)

testVSAnjcUVDR6> select COUNT(*) from ultravistaMapRemeasurementRaw where mapID=2 and catalogueID>0

| 832023417 |

Rows returned: 1 (execution time: 310.06 s)
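Comparing the totals above: Astrometry and Photometry now agree exactly, while Raw is one catalogue and a block of rows short. The arithmetic below is a quick check that the row gap is about the size of a single catalogue. The numbers are copied from the query results; the variable names and the "about one catalogue" reading are my own interpretation, not something stated in the thread.

```python
# Totals copied from the query results above.
rows = {"Raw": 832023417, "Ast": 832233482, "Pht": 832233482}
cats = {"Raw": 3841, "Ast": 3842, "Pht": 3842}

row_gap = rows["Ast"] - rows["Raw"]           # rows missing from Raw
missing_catalogues = cats["Ast"] - cats["Raw"]  # catalogues missing from Raw
mean_rows_per_cat = rows["Ast"] / cats["Ast"]   # typical catalogue size

print(row_gap, missing_catalogues, round(mean_rows_per_cat))
```

If the row gap is close to the mean catalogue size, a single un-ingested Raw catalogue would account for the whole discrepancy.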

wfastrononomer commented 1 week ago

Still one messed up?

select * from MapFrameStatus where programmeID=160 and mapID=2 and multiframeID>0 and isIngested=0

| programmeID | multiframeID | mapID | cuEventID | ppErrBitsStatus | catName | versionNum | deprecated | bitProcessingFlag | filteredImageName | catalogueID | isIngested |
| 160 | 6410466 | 2 | 55711 | 12648656 | /disk62/sys/test/products/map/20241014_v1/v20190321_00139_st_map160_2_1.fits | 1 | 0 | 1073741824 | NONE | 6100 | 0 |

wfastrononomer commented 1 week ago

All good now in testVSAnjcUVDR6, but some catalogues needed to be deleted and reprocessed.