wfau / ScienceArchives

0 stars 0 forks source link

Improve pipeline for matched aperture photometry #509

Open wfastrononomer opened 6 months ago

wfastrononomer commented 6 months ago

Pipeline not well laid out - works out catalogues to be processed, but doesn't use this and then recalculates in class. Mostly fixed

Better logging needed at ingest stage to show how much left - to reduce unnecessary logging and add % done and ETA.

tms-epcc commented 6 months ago

10/APR/24

tms-epcc commented 5 months ago

08/MAY/24 @wfastrononomer reported

tms-epcc commented 4 months ago

05/JUN/24

tms-epcc commented 3 months ago

11/JUL/24

tms-epcc commented 2 months ago

see also #589

tms-epcc commented 2 months ago

14/AUG/24 @wfastrononomer reported

wfastrononomer commented 2 months ago

Run on 50 - says none needing to run extractor, filterCat, classify, but some e.g. 3rd one - catalogueID=2138 - have bitProcessingFlag=0

ET: 4610984 /disk63/vsa/ingest/fits/20171211_v1.5/v20171211_00397_st.fit /disk 62/sys/test/products/map/20240719_v6/v20171211_00397_st_map160_2_1.fits 274 FIDD: ['stack'] DET: 4482841 /disk64/vsa/ingest/fits/20170611_v1.5/v20170611_00129_st.fit /disk 62/sys/test/products/map/20240719_v6/v20170611_00129_st_map160_2_1.fits 3474 FIDD: ['stack'] DET: 4471580 /disk64/vsa/ingest/fits/20170619_v1.5/v20170619_00032_st.fit /disk 62/sys/test/products/map/20240719_v6/v20170619_00032_st_map160_2_1.fits 2138 FIDD: ['stack'] DET: 4371150 /disk64/vsa/ingest/fits/20170423_v1.5/v20170423_00198_st.fit /disk 62/sys/test/products/map/20240719_v6/v20170423_00198_st_map160_2_1.fits 3233 FIDD: ['stack'] DET: 4353311 /disk64/vsa/ingest/fits/20170331_v1.5/v20170331_00193_st.fit /disk 62/sys/test/products/map/20240719_v6/v20170331_00193_st_map160_2_1.fits 3117 FIDD: ['stack'] DET: 4351897 /disk65/vsa/ingest/fits/20170305_v1.5/v20170305_00194_st.fit /disk 62/sys/test/products/map/20240719_v6/v20170305_00194_st_map160_2_1.fits 625 FIDD: ['stack']

I rerun with more print out and I get different results in the process layers - why? Had I forgot to send previous code changes to unas?

wfastrononomer commented 1 month ago

select bitProcessingFlag,isIngested,COUNT(*) from MapFrameStatus where programmeID=160 and mapID=2 group by bitProcessingFlag,isIngested

| | bitProcessingFlag | isIngested | | |

|>| 0 | 0 | 1738 |<| |>| 1792 | 0 | 2684 |<| |>| 33824 | 1 | 28 |<| |>| 1073741824 | 0 | 142 |<| |>| 1073741824 | 1 | 25 |<|

Why are there 28 frames with isIngested=1 and bitProcessingFlag<1073741824? All multiframeID<0 --> averaged combicats. Make sure these aren't selected earlier...

select bitProcessingFlag,isIngested,COUNT(*) from MapFrameStatus where programmeID=160 and mapID=2 and multiframeID>0 group by bitProcessingFlag,isIngested

| | bitProcessingFlag | isIngested | | |

|>| 0 | 0 | 991 |<| |>| 1792 | 0 | 2684 |<| |>| 1073741824 | 0 | 142 |<| |>| 1073741824 | 1 | 25 |<|

wfastrononomer commented 1 month ago

2024-08-21 19:27:27.48: Progress = 1% ETC (latest) Thu Aug 22 04:21:19 2024

(overall) Thu Aug 22 04:21:19 2024

2024-08-21 19:54:54.82: System Failure: IOError: [Errno 2] No such fil

e or directory: '/disk62/sys/test/products/map/20240719_v6/v20170421_00118_st_ma p160_2_1.fits'

Traceback (most recent call last): File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/CuSession .py", line 525, in run self._onRun() File "/unas/njc/VDFS/ListDrivenClean/src/curation/invocations/cu9/cu9.py", lin e 692, in _onRun self.prepCatHeaders(frameDet, normSoftDict) File "/unas/njc/VDFS/ListDrivenClean/src/curation/invocations/cu9/cu9.py", lin e 1165, in prepCatHeaders prov = pu.Provenance(self.archive, mapCatName) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py" , line 77, in init self._queryTypeFrameInfo('conf' if isConf else 'normal') File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py" , line 131, in _queryTypeFrameInfo self.components) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py", line 498, in getWholeProv components = getFileProv(db, filePath) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py", line 512, in getFileProv for hdu in fits.open(filePath): File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/FitsUtils.py", line 1171, in open lazy_load_hdus=False, kwds) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 148, in fitsopen lazy_load_hdus, kwargs) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 402, in fromfile lazy_load_hdus=lazy_load_hdus, *kwargs) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1006, in _readfrom fileobj = _File(fileobj, mode=mode, memmap=memmap, cache=cache) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/utils/decorators.py", line 539, in wrapper return function(args, **kwargs) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/file.py", line 150, in init self._open_filename(fileobj, mode, overwrite) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/file.py", line 506, in _open_filename self._file = fileobj_open(self.name, IO_FITS_MODES[mode]) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/util.py", line 395, in fileobj_open return open(filename, mode)

2024-08-21 19:54:54.82: Rolling back uncommited database transactions...

2024-08-21 19:54:55.28: Updating archive curation history with this row:

2024-08-21 19:54:55.28: (15941, 37, '', 'NONE', '2024-08-21T18:54:55.28', 'njc', '[ULTRAVISTA] A descriptive comment', 0)

2024-08-21 19:54:55.28: completed CU37: Creation of Matched Aperture FITS catalogues

File definitely missing.

select * from MapFrameStatus where catName like '%v20170421_00118_st_map160_2_1.fits'

| | programmeID | multiframeID | mapID | cuEventID | ppErrBitsStatus | catName | versionNum | deprecated | bitProcessingFlag | filteredImageName | catalogueID | isIngested | |

|>| 160 | 4395301 | 2 | 14441 | 12648656 | /disk62/sys/test/products/map/20240719_v6/v20170421_00118_st_map160_2_1.fits | 1 | 0 | 0 | NONE | 2333 | 0 |<|

bitProcessingFlag = 0 --> so no extraction or anything like that, so why was it trying prepCatHeaders? Add more checks in before there. I started that with filterCat, but haven't completed it.

wfastrononomer commented 1 month ago

hasProcessBeenRun( ..., checkPrev) --> checkPrev was True only if part of group processes.... Set True for all.

wfastrononomer commented 1 month ago

2024-08-22 10:01:14.47: imcore_list /disk63/vsa/ingest/fits/20140131_v1.5/v201

40131_00265_st.fit /disk63/vsa/ingest/fits/20140131_v1.5/v20140131_00265_st_conf .fit /disk62/sys/test/products/temp/20240822/sourceListultravista20240822_2_4.as cii /disk62/sys/test/products/map/20240719_v6/v20140131_00265_st_map160_2_1.fits 1.25 --rcore 3.0 --cattype 6 --noell

2024-08-22 10:28:49.95: Updating MapFrameStatus with extractor bit, for 439530

1,4427845,4630423,4631945,4843602,4843617,4844472,4844654,4846448,4846608,4846747,4846763,4846840,4846956

2024-08-22 10:28:52.53: Programming Error: IndexError: list index out of range

Traceback (most recent call last): File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/CuSession.py", line 525, in run self._onRun() File "/unas/njc/VDFS/ListDrivenClean/src/curation/invocations/cu9/cu9.py", line 517, in _onRun processLayersDict, plIndex, frameDetails, groupList) File "/unas/njc/VDFS/ListDrivenClean/src/curation/invocations/cu9/cu9.py", line 805, in checkForEmptyCatalogues frameInfo, isParts = processLayersDict[plType][plIndx]

2024-08-22 10:28:52.53: Rolling back uncommited database transactions..

wfastrononomer commented 1 month ago

Seems that last change to checkPrev broke this. This was the wrong change, but unsure why it tried to run prevCatHeaders on something that hadn't produced a catalogue!

wfastrononomer commented 1 month ago

Again:

2024-08-24 09:42:41.28: System Failure: IOError: [Errno 2] No such fil

e or directory: '/disk62/sys/test/products/map/20240719_v6/v20100508_00171_st_ma p160_2_1.fits'

Traceback (most recent call last): File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/CuSession .py", line 525, in run self._onRun() File "/unas/njc/VDFS/ListDrivenClean/src/curation/invocations/cu9/cu9.py", lin e 692, in _onRun self.prepCatHeaders(frameDet, normSoftDict) File "/unas/njc/VDFS/ListDrivenClean/src/curation/invocations/cu9/cu9.py", line 1166, in prepCatHeaders prov = pu.Provenance(self.archive, mapCatName) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py", line 77, in init self._queryTypeFrameInfo('conf' if isConf else 'normal') File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py", line 131, in _queryTypeFrameInfo self.components) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py", line 498, in getWholeProv components = getFileProv(db, filePath) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/ProvenanceUtils.py" , line 512, in getFileProv for hdu in fits.open(filePath): File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/FitsUtils.py", line 1171, in open lazy_load_hdus=False, kwds) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 148, in fitsopen lazy_load_hdus, kwargs) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 402, in fromfile lazy_load_hdus=lazy_load_hdus, *kwargs) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/hdu/hdulist.py", line 1006, in _readfrom fileobj = _File(fileobj, mode=mode, memmap=memmap, cache=cache) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/utils/decorators.py", line 539, in wrapper return function(args, **kwargs) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/file.py", line 150, in init self._open_filename(fileobj, mode, overwrite) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/file.py", line 506, in _open_filename self._file = fileobj_open(self.name, IO_FITS_MODES[mode]) File "/unas/scos/usr/local/lib/python2.7/site-packages/astropy/io/fits/util.py ", line 395, in fileobj_open return open(filename, mode)

2024-08-24 09:42:41.28: Rolling back uncommited database transactions...

2024-08-24 09:42:41.72: Updating archive curation history with this row:

2024-08-24 09:42:41.72: (16121, 37, '', 'NONE', '2024-08-24T08:42:41.72', 'njc', '[ULTRAVISTA] A descriptive comment', 0)

2024-08-24 09:42:41.83: completed CU37: Creation of Matched Aperture FITS catalogues

2024-08-24 09:42:41.84: for programme 160: UltraVISTA

2024-08-24 09:42:41.84: Updating programme curation history with this row:

2024-08-24 09:42:41.84: (16121, '2024-08-24T08:42:41.84', 0)

2024-08-24 09:42:41.85: THERE WERE ERRORS (see above)

2024-08-24 09:42:41.86: Disconnected from database: testVSAnjcUVDR6

SystemExit: see log testVSAnjcUVDR6_MapCurate_20240822_1222.log

wfastrononomer commented 1 month ago

Same issue with prepCatHeaders for non-extracted file

2024-08-24 03:44:07.77: Updating headers...

2024-08-24 03:51:53.78: Progress = 1% ETC (latest) Sat Aug 24 13:47:21 2024

(overall) Sat Aug 24 13:47:21 2024

2024-08-24 04:47:16.35: Progress = 10% ETC (latest) Sat Aug 24 13:58:23 2024

(overall) Sat Aug 24 13:57:08 2024

2024-08-24 05:49:20.68: Progress = 20% ETC (latest) Sat Aug 24 14:11:19 2024

(overall) Sat Aug 24 14:04:52 2024

2024-08-24 06:52:01.21: Progress = 30% ETC (latest) Sat Aug 24 14:16:11 2024

(overall) Sat Aug 24 14:09:32 2024

2024-08-24 07:56:47.58: Progress = 40% ETC (latest) Sat Aug 24 14:11:56 2024

(overall) Sat Aug 24 14:10:24 2024

2024-08-24 09:21:32.81: Progress = 50% ETC (latest) Sat Aug 24 16:29:00 2024

(overall) Sat Aug 24 14:56:04 2024

2024-08-24 09:42:41.28: System Failure: IOError: [Errno 2] No such fil

e or directory: '/disk62/sys/test/products/map/20240719_v6/v20100508_00171_st_ma p160_2_1.fits'

This one also has bitProcessingFlag=0. Shouldn't be in processLayers for prepCatHeaders. Do tests with calcProcessLayers to see why this might happen.

wfastrononomer commented 1 month ago

I think the problem is in calcProcessLayers / hasProcessBeenRun.

hasProcessBeenRun removes frames that have not had the previous step run, but calcProcessLayers uses Indices - which are then used on the original set of frames.

Better for hasProcessBeenRun to set a flag...

wfastrononomer commented 1 month ago

select bitProcessingFlag,isIngested,COUNT(*) from MapFrameStatus where programmeiD=160 and mapID=2 and multiframeID>0 group by bitProcessingFlag,isIngested

| | bitProcessingFlag | isIngested | | |

|>| 1073741824 | 2 | 2 |<| |>| 1073741824 | 0 | 3838 |<| |>| 1073741824 | 1 | 2 |<|

Where do isIngested = 2 values come from?

CatalogueIDs 720 and 1200. These had this issue before, and I reset them. Why do these ones keep having this issue. Where is it being set?

isIngested=2 if bad catalogue files to avoid ingesting. Let's check these. catNames = 'NONE'. Why? Delete these two entries and rerun - see if we get the same result.

wfastrononomer commented 1 month ago

2024-08-28 09:23:40.10: Inserting matched aperture data from 3840 files for mapID=2 into ultravistaMapRemeasurementRaw...

2024-08-28 09:23:40.10: Creating fits filelist dict...

2024-08-28 09:23:40.10: Checking files of productType=stack

2024-08-28 09:38:39.00: Removing for empty catalogues

2024-08-28 09:38:39.04: System Failure: OSError: [Errno 2] No such file or directory: 'NONE'

Traceback (most recent call last): File "./curation/invocations/cu9/MapCurate.py", line 112, in if cli.getOpt("redoCats") else None)) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/MapAutomator.py", line 234, in mapCurate extractType, usePawprints, mapID) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/MapUtils.py", line 139, in createIngestFile os.remove(catName)

wfastrononomer commented 1 month ago

2024-09-01 14:07:41.21: [29751] Ingest file left on file share: //unas/samba/share/cu34id29751_unas-1_testVSAnjcUVDR6_casu_ultravistaMapRemeasurementRaw.dat1

2024-09-01 14:07:41.21: mx.ODBC.Error.IntegrityError: The INSERT statement conflicted with the FOREIGN KEY constraint "fk_ultravistaMapRemeasurementRaw_catalogueID_to_MapFrameStatus_catalogueID". The conflict occurred in database "testVSAnjcUVDR6", table "dbo.MapFrameStatus", column 'catalogueID'.

Script executed: BULK INSERT ultravistaMapRemeasurementRaw FROM '\UNAS\share\cu34id29751_unas-1_testVSAnjcUVDR6_casu_ultravistaMapRemeasurementRaw.dat1' WITH (firstrow=1, datafiletype='native', tablock, check_constraints, order(catalogueID,apertureID))

Traceback (most recent call last): File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/IngIngester.py", line 511, in tryIngest ingestCU(ingLogFile, histInfo, ingestOrder, fileInfo) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/IngIngester.py", line 479, in ingestCU4 isCu3=False, cuEventID=ingLogFile.cuEventID) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/IngIngester.py", line 872, in ingest deleteFile=False, isOrdered=not isCu3) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/DbSession.py", line 2356, in ingestTable self._executeScript(sql, wantRowCount=True) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/DbSession.py", line 1933, in _executeScript self._raiseWithScript(error, script) File "/unas/njc/VDFS/ListDrivenClean/src/curation/wsatools/DbConnect/DbSession.py", line 2063, in _raiseWithScript raise err.class(message + "\nScript executed: " + script)