desihub / qlf

Developement of the QLF framework for DESI
BSD 3-Clause "New" or "Revised" License
4 stars 4 forks source link

QL changes in PR#693 #225

Closed Srheft closed 6 years ago

Srheft commented 6 years ago

@felipelm

felipelm commented 6 years ago

Hi @Srheft I will fix test this PR and fix the problem later today

sbailey commented 6 years ago

Thanks @Srheft and @felipelm . When this is working with desispec branch ql_updateQAs (for desihub/desispec#693), please show a successful run of several science exposures on desi-1.

Srheft commented 6 years ago

Hi @felipelm

[quicklook@desi-1 run_quicklook]$ ll $DESI_CCD_CALIBRATION_DATA/SIM

-rw-r--r-- 1 root root 10883520 Sep 17 10:10 fiberflat-b0.fits -rw-r--r-- 1 root root 10883520 Sep 17 10:10 fiberflat-b1.fits ... Empty array errors are usually the result of wrongly pointed calibs [we had one instance during the mock obs week] or not digestible files due to permission. Using the calibs above, I got the same SkySub_QL error in the sandbox as you see in the interface. But I got no error at all once I used the same but differently permissioned calibs for quicklook user in the sandbox:

[quicklook@desi-1 run_quicklook]$ll /n/home/quicklook/sarahE/desi/ccd_calibration_data/SIM

-rw-rw-r-- 1 quicklook quicklook 10883520 Sep 27 00:21 fiberflat-b0.fits -rw-rw-r-- 1 quicklook quicklook 10883520 Sep 27 00:19 fiberflat-b1.fits ... -rw-rw-r-- 1 quicklook quicklook 1270080 Sep 26 14:26 psf-b0.fits -rw-rw-r-- 1 quicklook quicklook 1270080 Sep 26 14:26 psf-b1.fits ...

Please change the permission on the calibration files and if they are the updated calibs, the last step should become up and running. You can just copy over the same ones I used [/n/home/quicklook/sarahE/desi/ccd_calibration_data/SIM].


felipelm commented 6 years ago

@Srheft I just fixed all permission errors and ran the pipeline on desi-1.

The key changes broke some petals and we noticed that /data/quicklook/exposures/raw/20191001/00003578/fibermap-00003578.fits doesn't have RA_OBS and DEC_OBS that we used on several plots.

Was there a change to another key name?

To test your changes on desispec just update the repository and run ./restartBackend.sh

Srheft commented 6 years ago

@felipelm The static plots that QL generates only use 'RA' and 'DEC' so we did not have any issue with that change in the fibermaps. you may want to pass on this question to Stephen. Let me know if I did not get what you meant. There might be changes in keys of fibermap and it is possible that because we didn't use them, we did not notice them.

Srheft commented 6 years ago

@felipelm Does initializing backend take a very long time or I have not done something correct here: [quicklook@desi-1 ]$cd backend/desispec/py/desispec/ [quicklook@desi-1 desispec]$ git pull remote: Enumerating objects: 12, done. remote: Counting objects: 100% (12/12), done. remote: Compressing objects: 100% (6/6), done. remote: Total 12 (delta 4), reused 8 (delta 4), pack-reused 0 Unpacking objects: 100% (12/12), done. From https://github.com/desihub/desispec 7d9fb14..bbc9c92 ql_updateQAs -> origin/ql_updateQAs Updating 7d9fb14..bbc9c92 Fast-forward doc/changes.rst | 4 +++- py/desispec/io/fiberflat.py | 3 ++- py/desispec/io/fluxcalibration.py | 3 ++- py/desispec/io/frame.py | 3 ++- py/desispec/io/sky.py | 3 ++- py/desispec/io/spectra.py | 3 ++- py/desispec/qa/qa_plots_ql.py | 2 +- py/desispec/qproc/io.py | 34 +++++++++++++++++----------------- py/desispec/test/test_io.py | 2 ++ 9 files changed, 33 insertions(+), 24 deletions(-)

[quicklook@desi-1 ]$ cd /software/qlf/master [quicklook@desi-1 master]$ ./restartBackend.sh Stopping master_app_1 ... done Recreating master_redis_1 ... done Recreating master_app_1 ... done Recreating master_nginx_1 ... done [quicklook@desi-1 master]$ docker exec -it master_app_1 bash groups: cannot find name for group ID 8003 I have no name!@c2679c20459f:/software/qlf/master/backend$ ./run.sh ./run.sh: line 2: [: =: unary operator expected Setting desispec... Setting desiutil... Setting desimodel... Setting desisim... Setting desitarget... Setting specter... Operations to perform: Apply all migrations: admin, auth, authtoken, contenttypes, dashboard, sessions, ui_channel Running migrations: No migrations to apply. /conda/lib/python3.6/site-packages/psycopg2/init.py:144: UserWarning: The psycopg2 wheel package will be renamed from release 2.8; in order to keep installing from binary please use "pip install psycopg2-binary" instead. For details see: http://initd.org/psycopg/docs/install.html#binary-install-from-pypi. """) Initializing QLF Daemon...


Initializing QLF Backend...

felipelm commented 6 years ago

@Srheft it is running. Try reloading http://desi-1.kpno.noao.edu:9000/monitor-realtime

Srheft commented 6 years ago
screen shot 2018-09-27 at 11 27 00 am

This is what I see. the log of camera b9 shows that the last step ran successfully. Is it the key updates that are not making the colouring to happen?

sbailey commented 6 years ago

Regarding the files in /software/ql/ccd_calibration_data/SIM:

Srheft commented 6 years ago

@sbailey

Fact: QL does not rewrite/manipulate any calb files or any other input. This can be easily verified by looking at the code obviously. Here is the story again: Using the calibs that were copied [on sep 17th] in /sofware/ql/ccd_calibration_data/SIM, I was getting the same empty array failure in the sandbox that Felipe reported from within the qlf in his first trial of PR#693. Just to check if it's relevant, I copied the same calibs from our machine at SMU to the sandbox at desi-1 and changed my local $DESI_CCD_CALIBRATION_DATA to read the calibs from the sandbox not from /software/ql and that removed the failure! so since I saw the only difference be the ownership of the calibs, I suggested to Felipe do that change and that seemed to have helped them too.

I do not know if something else made the error to go away but doing above was all i did so i attributed it to permission requirement for reading/opening the fits file, etc.

Srheft commented 6 years ago

@sbailey

I just diverted the sandbox to use the calibs in /software/ql/ and because of the changes of this morning, everything runs fine. Could you change the permissions of calibs to how they should be and then have us try again? thanks.

felipelm commented 6 years ago

the log of camera b9 shows that the last step ran successfully. Is it the key updates that are not making the colouring to happen?

Yes

Srheft commented 6 years ago

@felipelm ok thanks. I'll be pushing a commit in PR#693 to help remove the red petals from step 1 but are you waiting on me to do something for the key mismatch or the ball is in your court now?

sbailey commented 6 years ago

I just restored /software/ql/ccd_calibration_data to the correct versions in svn. Neither the data in SIM/ nor SIM_OLD/ were correct.

If QL doesn't work with this version of $DESI_CCD_CALIBRATION_DATA=/software/ql/ccd_calibration_data, then please open a separate ticket: either something needs to be fixed in QL, or something needs to be updated in those files, but please don't change those files in-place.

felipelm commented 6 years ago

@sbailey

Why were these changed?!? For both QL and QLF they should be treated as read-only inputs and never updated by them. They come via svn, and should only be updated externally. It seems like we still have a fundamental misunderstanding about how these files are supposed to be used.

How did they become owned by root? We shouldn't even have root access on those machines. Since docker runs as root, it appears that something within the docker containers is updating these files. That is very very bad.

If you ran the container without CURRENT_UID the container would run as root. I removed this possibility making uid and gid fixed to the ones on quicklook user on desi-1 a while back.

We don't change or use fiberflat files in qlf. Except when running desispec code that I'm not really familiar with the details.

sbailey commented 6 years ago

RA_OBS, DEC_OBS: that fibermap file /data/quicklook/exposures/raw/20191001/00003578/fibermap-00003578.fits is unchanged since mock observing (when it also didn't have RA_OBS, DEC_OBS). i.e. this isn't a change related to all the recent PRs.

Unfortunately the fibermap format has been in a confusing state for too long: The old format did have RA_OBS, DEC_OBS but when negotiating the format details with ICS we agreed to change it to FIBER_RA, FIBER_DEC which is what we used for mock observing. I thought that is what QLF was using for its development and testing, but it appears that you've been developing against a different set of test files. For developing and testing QL/F please use the mock observing test files in /data/quicklook/exposures/raw/.

We're working on updating the simulation pipeline to generate fibermap files in the new (i.e. mock observing) format and that is the top remaining issue for the 18.9 software release. In the meantime quicklook has able to support both old and new formats since desispec.io.read_fibermap auto-detects the new format and calls fibermap_new2old to convert back to the old format in memory so that the quicklook code could still use it.

If you have other fibermap columns that you thought should exist but are missing from the mock observing format fibermap files, see desispec.io.fibermap.fibermap_new2old for the mapping of column names. But please write QLF to "think" in the new format natively.

Srheft commented 6 years ago

@felipelm Shall I continue to edit qlf mapping codes in this PR to make the colouring issues in the steps 1 and 4 to go away? or now that you are also making the changes for the regrouping of the QAs to the three CHECK CCDs, CHECK FIBERS, CHECK SPECTRA, my changes here are not helping. Plesae let me know and I do as is fit. Thank you for all the efforts.

felipelm commented 6 years ago

@Srheft we prefer to move to the new format (3 stages) and use the mapping I suggest. This way we can make sure we are using the correct keys everywhere.

We will change the files you've changed here pretty soon to accommodate the new structure but thanks for the help

Srheft commented 6 years ago

Yeah that's more efficient. Thank you.

Srheft commented 6 years ago

@sbailey Regarding your concern here:

If QL doesn't work with this version of $DESI_CCD_CALIBRATION_DATA=/software/ql/ccd_calibration_data, then please open a separate ticket ...

See if this verification [done few minutes ago] on desi-1 is satisfactory:

[quicklook@desi-1 run_quicklook]$ echo $DESI_CCD_CALIBRATION_DATA /software/ql/ccd_calibration_data

[quicklook@desi-1 run_quicklook]$ desi_quicklook -i ./config_files/qlconfig_darksurvey.yaml -n 20191001 -c b0 -e 3578

The screen log for one cam is provided below for your view [smooth run is verified for all 30 cams of science 3578, arc 3571, flat 3574]: desi-1-QL-screenlog_09302018.txt

Let me know if I could add any further evidence on QL performance that would eliminate any remaining concerns.

sbailey commented 6 years ago

@Srheft this run of QL by itself appears ok. However, two remaining concerns:

i.e. QL working by itself is a necessary but not sufficient condition. We also need QL as run by QLF to work and produce output files that are understandable and useable by QLF to display the results, without anybody changing the /software/ql/ccd_calibration_data/SIM/ files. When that is working on http://desi-1.kpno.noao.edu:9000 please let me know.

Note: Operations review is this week, with associated distractions.

Srheft commented 6 years ago

@sbailey

and the run ended at: 2018-09-30 21:11:19,751 QuickLook INFO : Pipeline completed. Final result is in /n/home/quicklook/sarahE/desi/spectro/redux/QL/exposures/20191001/00003578/sframe-b0-00003578.fits

So the run happened on Sep 30th for b0 but the fiberflat is altered[copied] last on Sep 27th. Does this sound reassuring now?


On why there is no working version of QL+QLF yet. Please read the two messages above the one to which you replied. QLF folks have decided to migrate to three-step set up while also apply the key name changes.

@felipelm Shall I continue to edit qlf mapping codes in this PR to make the colouring issues in the steps 1 and 4 to go away? or now that you are also making the changes for the regrouping of the QAs to the three CHECK CCDs, CHECK FIBERS, CHECK SPECTRA, my changes here are not helping. Plesae let me know and I do as is fit. Thank you for all the efforts.

@Srheft we prefer to move to the new format (3 stages) and use the mapping I suggest. This way we can make sure we are using the correct keys everywhere.

I thought the presentation for which the screenshots were needed was going to happen during the upcoming meeting in Barcelona so I assumed there will be time. I'll followup with @felipelm on latest developments.

felipelm commented 6 years ago

@Srheft can we close this PR?