relevant pointer from the record of transfers NERSC <-> IN2P3 : https://github.com/LSSTDESC/ImageProcessingPipelines/issues/86#issuecomment-521841925
this resulted in 257 visits processed, the list being attached
Here's a file with the missing y02 visits: Run2.2i_y02_missing_visits.txt There are 555 visits and the columns in the file are visit number and MJD.
Here is the code that generated this list:
import os
import glob
import sqlite3
import pandas as pd
# Find the visits that were processed
visit_dirs = glob.glob('/global/cfs/cdirs/lsst/production/DC2_ImSim/Run2.2i/desc_dm_drp/v19.0.0-v1/rerun/run2.2i-calexp-v1/calexp/*')
processed_visits = set(int(os.path.basename(_).split('-')[0]) for _ in visit_dirs)
# Extract the y02 visits from the minion_1016 db
with sqlite3.connect('/global/homes/j/jchiang8/scratch/desc/Run2.2i/minion_1016_desc_dithered_v4_trimmed.db') as conn:
    df = pd.read_sql('select * from summary', conn)
t0 = min(df['expMJD'])
my_df = df.query(f'{t0 + 365} < expMJD < {t0 + 2*365}')
minion_visits = set(my_df['obsHistID'])
# Find the missing visits
missing_visits = sorted(list(minion_visits.difference(processed_visits)))
# Write a file with the visits and MJDs
expMJDs = dict(zip(my_df['obsHistID'], my_df['expMJD']))
missing_visits = {visit: expMJDs[visit] for visit in missing_visits}
with open('Run2.2i_y02_missing_visits.txt', 'w') as output:
    for visit, expMJD in missing_visits.items():
        output.write(f'{visit} {expMJD:.2f}\n')
ok my list of visits is contained in yours, Jim. So we need to make sure we understand what is going on with the other ones : the ones downloaded to CC go from 00385844 to 00445379 according to the naming convention in the paths at CC.
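A check like "my list is contained in yours" could be sketched as below. This is only an illustration, not what was actually run: the helper names `read_visits` and `compare_visit_lists` are made up here, and it assumes each list is a text file whose first column is the visit number (possibly zero-padded, as in the paths at CC).

```python
def read_visits(path):
    """Read visit numbers from the first column of a text file.

    Assumption: one visit per line, first column is the visit number,
    optionally zero-padded (e.g. 00385844). Extra columns are ignored.
    """
    visits = set()
    with open(path) as fobj:
        for line in fobj:
            tokens = line.split()
            if tokens and not tokens[0].startswith('#'):
                visits.add(int(tokens[0]))
    return visits


def compare_visit_lists(candidate, reference):
    """Return (is_subset, extras), where extras are the visits in
    candidate that do not appear in reference."""
    extras = sorted(set(candidate) - set(reference))
    return not extras, extras
```

If `extras` comes back empty, the first list is fully contained in the second; otherwise it names the visits that still need an explanation.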
all_visits_to_date_031220.txt At the request of Jim, here is a file containing all visits simulated to date!
Based on Antonio's list of simulated visits and on the master list of minion_1016 visits, there are 540 visits missing in total through year 3. However, about half of those visits (specifically all of the y01 and y03 missing ones) are on the periphery of the DC2 300 sq degree region, and 256 would need to be simulated to fill the hole in y02. Here is the summary plot showing the distribution of missing visits as a function of MJD, and the centers of the visits (i.e., the pointing directions) for each year. (I've updated the plots of the pointing directions with the boundary of the DC2 300 sq degree region.)
STOP PRESS: Eve recalled that we found a problem with the redshifts in the instance catalogs used for Run2.1.1i (redshift_true was used instead of redshift, which includes peculiar velocities), so we would need to re-simulate all of the visits in the hole, nominally the 555 I posted above.
Heather and I are pulling Y2 Run2.2 instance catalogs from tape to prep for any necessary resimulation and to verify that these visits did receive the updated instance catalogs. Now that the memory is flowing, I thought we had chosen to resim y2 from scratch because of this issue, and I am unsure why we are missing these visits...
Here's a plot of the pointing directions of the 555 y02 "missing" visits: As noted above, the visits in this list are the ones in minion_1016 that are not in registry.sqlite3 for y02. Assuming we will skip the visits outside of the DC2 boundary, there are 513 visits to simulate, which are plotted in red. Here is the list: y02_missing_visits_to_simulate.txt
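For reference, a cut like the 555 -> 513 one could be sketched roughly as follows. This is not the code used to make the list above. It assumes the selection is done on the pointing centers, treats RA/Dec as flat coordinates (no spherical geometry or RA wrap-around handling), and takes the boundary as a list of polygon vertices that the caller supplies; the function names are made up for the illustration.

```python
def point_in_polygon(x, y, vertices):
    """Ray-casting test: True if (x, y) lies inside the polygon.

    vertices is a sequence of (x, y) tuples in order around the polygon.
    Flat-sky assumption: coordinates are treated as planar.
    """
    inside = False
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def select_visits_in_region(pointings, vertices):
    """Return the sorted visits whose pointing center is inside the region.

    pointings maps visit number -> (ra, dec) in the same units as vertices.
    """
    return sorted(visit for visit, (ra, dec) in pointings.items()
                  if point_in_polygon(ra, dec, vertices))
```

A selection based on whether any part of the field of view overlaps the region would keep more visits than this center-only test.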
The Y02 Run2.2i instance catalogs are being copied from HPSS and are appearing here: /global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.2i/InstCat/copyFromHPSS/y02_191109
I expect the transfer to finish in the next ~4 hours, and then it's just a matter of extracting the tarballs.
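The extraction step could look something like the sketch below. It assumes the tarballs are gzip-compressed and match `*.tar.gz`, which I have not checked against the actual file names, and `extract_tarballs` is just an illustrative helper.

```python
import tarfile
from pathlib import Path


def extract_tarballs(src_dir, dest_dir, pattern='*.tar.gz'):
    """Extract every tarball in src_dir matching pattern into dest_dir.

    Returns the names of the tarballs that were extracted, in sorted order.
    Only use this on archives from a trusted source.
    """
    src_dir, dest_dir = Path(src_dir), Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    extracted = []
    for tarball in sorted(src_dir.glob(pattern)):
        # Mode 'r' (the default) auto-detects the compression.
        with tarfile.open(tarball) as tar:
            tar.extractall(dest_dir)
        extracted.append(tarball.name)
    return extracted
```

Running it once per retrieved directory and comparing the returned names against the expected visit list gives a quick sanity check that nothing was dropped in the transfer.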
Antonio tells me that what I pulled from HPSS and put into /global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.2i/InstCat/copyFromHPSS/y02_191109 contains none of the missing visits. This is a bit mystifying, because I copied these files directly from the instance catalog area Scott used to generate the instance catalogs: /global/cscratch1/sd/descim/instcat_y02_191109
However, there is another stash of instance catalogs on projecta, /global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.2i/InstCat/WFD/y02, which is owned by the descim account and which I have no knowledge of. I don't know where this set of instance catalogs came from or why it seems to be completely distinct from what I thought was the full set.
A couple of things: we need to verify that the instance catalogs contained in /global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.2i/InstCat/WFD/y02 should be used to generate the missing visits. Are they configured correctly? Who can check this? Secondly, it would be nice to figure out where this stash came from. I'll set out to add this set to HPSS so we have them all stored for posterity.
If there are no objections - I will remove the copy of y02 instance catalogs that I already pulled from HPSS to free up some disk space, as it seems we don't need them.
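To see which of the missing visits a given instance catalog directory actually contains, something like the following sketch would do. It assumes every entry in the directory has a name that starts with the (possibly zero-padded) visit number, which may not match the real layout, and the function names are hypothetical.

```python
import os
import re


def visits_in_directory(instcat_dir):
    """Collect visit numbers from the leading digits of each entry name.

    Assumption: entries are named like 00400001 or 00400001.tar.gz.
    Entries without leading digits are skipped.
    """
    visits = set()
    for name in os.listdir(instcat_dir):
        match = re.match(r'\d+', name)
        if match:
            visits.add(int(match.group()))
    return visits


def check_coverage(missing_visits, instcat_dir):
    """Split missing_visits into (found, absent) relative to instcat_dir."""
    available = visits_in_directory(instcat_dir)
    missing = set(missing_visits)
    return sorted(missing & available), sorted(missing - available)
```

Calling `check_coverage` with the visit list and each candidate directory in turn would show which stash, if any, covers the hole.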
I hope we did use the new instance catalogs that Joanne generated and not the incorrect ones from Scott. I also hope that the instance catalogs we used have not been purged by now if they were not moved.
Antonio, can you clarify what you used?
On 3/13/20 11:44 AM, Heather Kelly wrote:
Antonio tells me that what I pulled from HPSS and put into |/global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.2i/InstCat/copyFromHPSS/y02_191109|
- contains none of the missing visits. This is a bit mystifying, because I copied these files directly from the instance catalog area Scott used to generate the instance catalogs: |/global/cscratch1/sd/desc/HPSS/dc2/run2.2i/instCat/y02| However there is another stash of instance catalogs on projecta: |/global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.2i/InstCat/WFD/y02| which is owned by the |descim| account, that I have no knowledge of. I don't know where this set of instance catalogs came from and why it seems to be completely distinct from what I thought was the full set.
A couple of things.. we need to verify that the instance catalogs contained in |/global/projecta/projectdirs/lsst/production/DC2_ImSim/Run2.2i/InstCat/WFD/y02| should be used to generate the missing visits - are they configured correctly? Who can check this? And secondly - it would be nice to figure out where this stash came from. I'll set out to add this set to HPSS so we have them all stored for posterity. If there are no objections - I will remove the copy of y02 instance catalogs that I already pulled from HPSS to free up some disk space, as it seems we don't need them.
I think what I stored to HPSS is what Joanne generated, though I would like to clarify what area was used for her instance catalog generation. I believe it was the same area that Scott originally used. These files were copied to tape on Jan 31, 2020 from /global/cscratch1/sd/descim/instcat_y02_191109
Note the corrected path from what I commented above.
The fact that the files retrieved from HPSS do not include the missing visits explains why the images for those visits were not generated in the first place, since the intent for Run2.2i was to generate imsim data for everything. I'd like to have a closer look at the unpacked files myself first, but I expect there is something in the instance catalog generation scripts that skipped the Run2.1.1i visits and that was still enabled when Joanne remade the instance catalogs with the corrected redshifts, etc.
I can confirm that we used the instance catalogs that were then stored to tape. That gives us a good explanation as to why we missed these visits at least, since my worklist pulls directly from the list of instance catalogs and then trims based on the sensor region.
As to why we seemingly have the remaining instance catalogs in another folder... that I cannot answer. My only possible guess is that we separated them out, but I do not recall doing this and the folder naming is not consistent with this hypothesis.
> As to why we seemingly have the remaining instance catalogs in another folder
Which folder specifically?
Here is the provenance of the instcats in /global/projecta/projectdirs/lsst/production/Run2.2i/InstCat/WFD/y02: https://lsstc.slack.com/archives/C77DDKZHR/p1571690314032400
Scott wrote them and they were the ones used to find and diagnose the redshift problem (among other things).
Wait, is that perhaps why these were left out of the remaining instance catalogs? These were final versions and it was just chosen not to redo the work during instance catalog generation?
That seems to be the case in that these were meant to be the final versions, but they were found to have the redshift bug when we made the comparisons with the truth catalogs.
@JoanneBogart has agreed to generate new instance catalogs for the 513 visits that are needed using the list posted above. Once those are available, we can use them to generate the images for those visits.
Sounds like a plan to me, though I thought these would be equivalent to those in /global/projecta/projectdirs/lsst/production/Run2.2i/InstCat/WFD/y02. Were additional changes needed from that baseline?
The files in that directory used incorrect redshifts as we found by comparing the implied fluxes with the truth catalog values, and there are only 279 of them.
Hrm. I would have sworn I remember looking at it and having 500+ instance catalogs. Still, if they were with incorrect redshifts, we can wait for Joanne to generate the remaining 513 visits worth.
The job is running now. It should be done in a few hours. For the most part it looks ok (254 catalogs have been written successfully so far) but there are one or two problems. I'll have to make another small run to redo those.
500 catalogs were written without error. The tar files are available at
/global/cscratch/descim/y02_instCat_missing
I'll redo the others and put them in a separate directory.
When you say "the others" you mean the other 13? Just to be sure, we only have to redo the "hole" now, everything else is correct, right? (It's sometimes hard to follow the discussions). Thanks!
Yes. Just the other 13 from Jim's list above.
Those 13 are now done as well. The tar files are in
/global/cscratch/descim/y02_instCat_missing_remaining
With the instance catalogs generated and simulations running, I'm closing this issue. If we need to track anything having to do with sims or image processing, we can open a new issue.
Here are the depth maps for the Y02 data processed for Run2.2i as of today: We have ascertained that there are ~550 visits in the minion_1016 ops db that we had intended to process but which do not appear in the Run2.2i processed calexps.
The missing visits seem to be the ones we generated for Run2.1.1i, which we omitted from Run2.2i to avoid redoing the image sims. We'll use this issue to document what files are missing, and what's available from Run2.1.1i, both in terms of raw files and in processed data.