Closed abhi0395 closed 2 months ago
This looks good, though the fix to trans is so basic that I don't understand how archetypes ever worked without it. You mention "For some targetids the archetype mode fails..." which implies that for some it succeeds? Do you understand what is different about those other cases that succeeded? For example, perhaps the successful cases had all of the minima at z<2, so it never tried to access the IGM transmission?
I'm concerned that there is some other bookkeeping bug in addition to the one you fixed here, so I'd like to understand why the fix was so simple and what allowed the previously succeeding cases to work.
I agree with you that the problem is not very well understood. I tried to dig into this more and here is what I found. The issue appears in the following part of the function fitz():
            trans[k] = T
            if (T is None):
                #Return value of None means that wavelength regime
                #does not overlap Lyman transmission - continue here
                continue
            #Vectorize multiplication
            binned[k] *= T[:,:,None]
        #Use CPU always with one redshift
        (chi2, coeff) = calc_zchi2_batch(spectra, binned, weights, flux, wflux, 1, nbasis,
            solve_matrices_algorithm=template.solve_matrices_algorithm,
            use_gpu=False)
        coeff = coeff[0,:]
    except ValueError as err:
        if zmin<redshifts[0] or redshifts[-1]<zmin:
            #- beyond redshift range can be invalid for template
            coeff = np.zeros(template.nbasis)
            zwarn |= ZW.Z_FITLIMIT
            zwarn |= ZW.BAD_MINFIT
        else:
            #- Unknown problem; re-raise error
            raise err
The issue is that when a ValueError is raised, we do not deal with the trans dictionary at all, even though it is later passed to archetypes.get_best_archetype() in archetype mode. In PCA mode this never causes a failure, because
    (chi2, coeff) = calc_zchi2_batch(spectra, binned, weights, flux, wflux, 1, nbasis,
        solve_matrices_algorithm=template.solve_matrices_algorithm,
        use_gpu=False)
is never run when a ValueError is raised.
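The failure pattern described above can be illustrated with a toy sketch. This is not redrock code: `rebin`, `fit_one`, `downstream`, the band names, and the `fill_on_error` option are all hypothetical stand-ins, and the guard shown is only one possible way to keep the dictionary consistent, not necessarily what this pull request does.

```python
# Toy model of the bookkeeping problem. Every name here is hypothetical
# and none of this is redrock code.

BANDS = ("b", "r")


def rebin(z):
    """Stand-in for a rebinning step that rejects invalid redshifts."""
    if z < 0:
        raise ValueError(f"redshift {z} is outside the template range")
    return {band: 1.0 for band in BANDS}


def fit_one(z, z_lo, z_hi, fill_on_error=False):
    """Return (trans, flagged) for one candidate redshift."""
    trans = {}
    flagged = False
    try:
        binned = rebin(z)
        for band in binned:
            trans[band] = 1.0  # placeholder transmission value
    except ValueError:
        if z < z_lo or z_hi < z:
            flagged = True  # analogous to setting warning bits
            if fill_on_error:
                # Possible guard: record an explicit "no transmission" entry
                trans = {band: None for band in BANDS}
        else:
            raise  # unknown problem, propagate
    return trans, flagged


def downstream(trans):
    """Stand-in for a later step that expects one entry per band."""
    return [trans[band] for band in BANDS]
```

With an out-of-range redshift, `fit_one(-0.37, -0.005, 1.7)` returns an empty dictionary together with the flag, and `downstream` then raises a KeyError. A caller that only reads the flag (as in the PCA case) never notices the missing entries.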
For the following failure case:
rrdesi -i /global/cfs/cdirs/desi/spectro/redux/daily/tiles/cumulative/11472/20240116/coadd-0-11472-thru20240116.fits -o redrock_test-39628128854739056.fits -d redrock_39628128854739056.h5 --targetids 39628128854739056 --archetypes /global/homes/a/abhijeet/software/desisoft/new-archetypes/rrarchetype-galaxy.fits
I just printed:
print(zz[i-1:i+2], zzchi2[i-1:i+2], zmin, redshifts[0], redshifts[-1])
[0.5109059 0.51105493 0.51120395] [7877.36377944 7870.93398968 7864.50311703] -0.3738767086571399 -0.0050000000000000044 1.6997470838899442
It appears that the parabolic fit fails for this chi2 vs. redshift curve: the three chi2 values decrease monotonically, so there is no local minimum to fit, and the resulting zmin is negative and below redshifts[0]. In PCA mode a ZWARN is therefore raised and stored without modifying the trans dictionary. But in archetype mode, we pass all these redshifts and the corresponding trans dictionary on for the final fit, and therefore it fails.
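As a rough check of this reading, the parabola through the three printed points can be evaluated directly. This is a minimal sketch in plain Python, not the fitting routine redrock uses; `parabola_vertex` is a hypothetical helper. The printed zz values are rounded, and the curvature is very sensitive to that rounding, so the vertex computed here need not reproduce the printed zmin. Only the qualitative result is of interest: whether the parabola is concave and whether its vertex falls outside the redshift range.

```python
def parabola_vertex(x, y):
    """Quadratic coefficient and vertex of the parabola through three points."""
    (x0, x1, x2), (y0, y1, y2) = x, y
    d01 = (y1 - y0) / (x1 - x0)  # first divided differences
    d12 = (y2 - y1) / (x2 - x1)
    a = (d12 - d01) / (x2 - x0)  # quadratic coefficient
    # Newton form: p(x) = y0 + d01*(x - x0) + a*(x - x0)*(x - x1),
    # so p'(x) = d01 + a*(2x - x0 - x1), which vanishes at the vertex.
    vertex = 0.5 * (x0 + x1) - d01 / (2.0 * a)
    return a, vertex


# Values copied from the printed output above
zz = [0.5109059, 0.51105493, 0.51120395]
zzchi2 = [7877.36377944, 7870.93398968, 7864.50311703]
z_lo, z_hi = -0.0050000000000000044, 1.6997470838899442

a, vertex = parabola_vertex(zz, zzchi2)
concave = a < 0  # a concave parabola has a maximum, not a minimum
out_of_range = vertex < z_lo or z_hi < vertex
print("concave:", concave, "vertex outside range:", out_of_range)
```

If both checks come out true, the condition `zmin<redshifts[0] or redshifts[-1]<zmin` in the except branch is the one that applies, which matches the ZWARN path described above.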
This pull request solves two separate issues on the main branch:
1) For some targetids the archetype mode fails, citing a KeyError in the trans dictionary.
2) The main branch fails if we want to run the archetype method without Legendre polynomials.
Example run for case (1) on the main branch:
rrdesi -i /global/cfs/cdirs/desi/spectro/redux/daily/tiles/cumulative/6786/20240117/coadd-4-6786-thru20240117.fits -o arch_test-39628278872412904.fits -d arch_39628278872412904.h5 --archetypes /global/homes/a/abhijeet/software/desisoft/new-archetypes/rrarchetype-galaxy.fits --targetids 39628278872412904
fails citing the following error:
But when run on the archetype_leg0_and_trans_fix branch, it succeeds.
Example run for case (2):
rrdesi -i /global/cfs/cdirs/desi/spectro/redux/daily/tiles/cumulative/11472/20240116/coadd-0-11472-thru20240116.fits -o /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/archetype_leg0_and_trans_fix/redrock_test-39628128854739056.fits -d /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/archetype_leg0_and_trans_fix/redrock_39628128854739056.h5 --targetids 39628128854739056 --archetypes /global/homes/a/abhijeet/software/desisoft/new-archetypes/rrarchetype-galaxy.fits
fails on the main branch:
But succeeds on the archetype_leg0_and_trans_fix branch.
Sanity checks
Comparing results from main and this branch:
- Archetype mode (main):
srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi --gpu --max-gpuprocs 4 -i rrdesi_mpi -i /global/cfs/cdirs/desi/spectro/redux/daily/tiles/cumulative/11472/20240116/coadd-0-11472-thru20240116.fits -o /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/main/arch_redrock-0-11472-thru20240116.fits -d /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/main/arch_redrock-0-11472-thru20240116.fits --archetypes /global/homes/a/abhijeet/software/desisoft/new-archetypes/rrarchetype-galaxy.fits
- Archetype mode (this branch):
srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi --gpu --max-gpuprocs 4 -i rrdesi_mpi -i /global/cfs/cdirs/desi/spectro/redux/daily/tiles/cumulative/11472/20240116/coadd-0-11472-thru20240116.fits -o /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/archetype_leg0_and_trans_fix/arch_redrock-0-11472-thru20240116.fits -d /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/archetype_leg0_and_trans_fix/arch_redrock-0-11472-thru20240116.fits --archetypes /global/homes/a/abhijeet/software/desisoft/new-archetypes/rrarchetype-galaxy.fits
- Comparison:
- Non-archetype mode (main):
srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi --gpu --max-gpuprocs 4 -i rrdesi_mpi -i /global/cfs/cdirs/desi/spectro/redux/daily/tiles/cumulative/11472/20240116/coadd-0-11472-thru20240116.fits -o /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/main/redrock-0-11472-thru20240116.fits -d /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/main/redrock-0-11472-thru20240116.fits
- Non-archetype mode (this branch):
srun -n 4 -c 4 --gpu-bind=map_gpu:3,2,1,0 rrdesi_mpi --gpu --max-gpuprocs 4 -i rrdesi_mpi -i /global/cfs/cdirs/desi/spectro/redux/daily/tiles/cumulative/11472/20240116/coadd-0-11472-thru20240116.fits -o /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/archetype_leg0_and_trans_fix/redrock-0-11472-thru20240116.fits -d /global/cfs/cdirs/desi/users/abhijeet/test_archetype_runs/archetype_leg0_and_trans_fix/redrock-0-11472-thru20240116.h5