imcgreer / simqso

Module for generating simulated quasar spectra
BSD 3-Clause "New" or "Revised" License
12 stars 11 forks source link

incremental progress toward making simqso python3 compatible #15

Closed moustakas closed 6 years ago

moustakas commented 7 years ago

Mostly changed print statements to functions (and tabs to spaces in a couple modules). I tested by trying to run the example notebooks but there were some issues I couldn't quite resolve. One major issue is missing files (e.g., apjaa7051t4_mrt.txt in SimpleSpecExample.ipynb).

Another (more important) issue is the next() method around line 959 of sqgrids.QsoSimObjects.getBandIndex().

Another is whether the filter object in QsoSimObjects.getVars can be turned into a list as I'm doing.

Finally, when I ran SimpleSpecExample.ipynb I got a table of QSO properties but no spectra, which I think might have to do with line ~433 in sqrun.buildSpectraBulk.

Thanks!

imcgreer commented 7 years ago

Thanks for putting this together! I guess I will have to relent on the tabs v. spaces issue, so we might as well convert all the files. On the other hand, in cc4a05e2ba3f31003b4f33ec63481e86bdb5c35c and 42a921107ff6889d0e181e9969e7e3270689d392 you join multi-line statements into a single line which is longer than 79 characters and thus not PEP-8 compliant.

The missing file apjaa7051t4_mrt.txt was used to plot a comparison between the published Lyu+Rieke SED and the model reconstruction of it in simqso. I don't want to check that file in as it is a copyrighted AAS table. We could either include a link to download it in the NB and skip it if not present, or use urllib to download it. What do you think?

What is the issue with next()? The description for python3 is the same as python2.

And I see that filter() was changed to return an iterable instead of list in python3. You can force resolution using list() as you have done.

I'll try to see what's going on with the buildSpectraBulk issue.

moustakas commented 7 years ago

Here's what I have so far:

imcgreer commented 7 years ago

Just applied the PR in a working copy and noticed the conversion was to 8 spaces per tab instead of 4.

This patch fixes the issue with buildSpectraBulk -- another case where python3 returns an object instead of a list:

diff --git a/simqso/sqrun.py b/simqso/sqrun.py
index f50f9df..d3d8df5 100644
--- a/simqso/sqrun.py
+++ b/simqso/sqrun.py
@@ -326,7 +326,7 @@ def buildGrpSpectra(wave,cosmo,specFeatures,photoCache,saveS

 def _regroup(spOut):
         # XXX tell me there's a better way to do this
-        n = len(list(spOut)[0])
+        n = len(spOut[0])
         rv = [ [] for i in range(n) ]
         for sp in spOut:
                 for j in range(n):
@@ -426,7 +426,7 @@ def buildSpectraBulk(wave,qsoGrid,procMap=map,
                                          specFeatures,photoCache,iterNum=iterNu
                                          saveSpectra=saveSpectra)
                 print('buildSpectra iteration ',iterNum,' out of ',nIter)
-                specOut = procMap(build_one_spec,qsoGrid)
+                specOut = list(procMap(build_one_spec,qsoGrid))
                 specOut = _regroup(specOut)
                 synMag,synFlux,spectra = specOut
                 v = qsoGrid.getVars(grids.SightlineVar)

Will look into next() next.

moustakas commented 7 years ago

Yeah, I hadn't made the change to 4 spaces yet. I just used EMACS' untabify keyboard shortcut. Let's clean up to ~PEP8 after we get everything running.

imcgreer commented 7 years ago

Here's the fix for next(). It's a better method anyway. This gets quickspeclib working for me.

diff --git a/simqso/sqgrids.py b/simqso/sqgrids.py
index 9564f54..8906da1 100644
--- a/simqso/sqgrids.py
+++ b/simqso/sqgrids.py
@@ -956,7 +956,8 @@ class QsoSimObjects(object):
                 else:
                         return None
         def getBandIndex(self,band):
-            return next(j for j in range(len(self.photoBands)) if self.photoMap['filtName'][self.photoBands[j]]==band) 
+                shortNames = [ self.photoMap['filtName'][b] for b in self.photoBands ]
+                return shortNames.index(band)
         def getObsBandIndex(self):
                 return self.getBandIndex(self.getVars(AppMagVar)[0].obsBand)
         def read(self,gridFile,clean=False):
moustakas commented 7 years ago

I've implemented the proposed changes -- thanks! Two other notebooks are crashing -- in ColorzKcorr.ipynb the file 'forest_grid_2500.fits' is missing.

And bossqsos_example.ipynb fails in cell [3] with the error below.


boss_dr9qlf_sim output not found
generating QSO grid
integration returned  390  objects
... building continuum grid
WARNING: GaussianPLawDistribution continuum is deprecated
simulating  390  quasar spectra
units are  flux
max number iterations:  5

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-3-7c1853bb0b92> in <module>()
----> 1 _ = bossqsos.qsoSimulation(bossqsos.simParams,saveSpectra=True)

~/repos/simqso/simqso/sqrun.py in qsoSimulation(simParams, **kwargs)
    595         _,spectra = buildSpec(wave,qsoGrid,procMap,
    596                               maxIter=simParams.get('maxFeatureIter',5),
--> 597                               verbose=verbose,saveSpectra=saveSpectra)
    598         timerLog('Build Quasar Spectra')
    599         #

~/repos/simqso/simqso/sqrun.py in buildSpectraBySightLine(wave, qsoGrid, procMap, maxIter, verbose, saveSpectra)
    376                         qsoGrid.data[k][objgrp['_ii']] = out[k]
    377         if saveSpectra:
--> 378                 spectra = np.vstack([s['spectra'] for s in specOut])
    379                 spectra = spectra[qsoGroups.parent['_ii'].argsort()]
    380         else:

/usr/local/anaconda3/envs/desi/lib/python3.5/site-packages/numpy/core/shape_base.py in vstack(tup)
    235 
    236     """
--> 237     return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
    238 
    239 def hstack(tup):

ValueError: need at least one array to concatenate
imcgreer commented 6 years ago

Argh, another place. Think this should be it:

diff --git a/simqso/sqrun.py b/simqso/sqrun.py
index d3d8df5..1a4c9d0 100644
--- a/simqso/sqrun.py
+++ b/simqso/sqrun.py
@@ -363,7 +363,7 @@ def buildSpectraBySightLine(wave,qsoGrid,procMap=map,
         # pool.map() doesn't like the iterable produced by table.group_by(), so
         # forcing resolution of the elements here with list() -- not that much
         # memory anyway
-        specOut = procMap(build_grp_spec,list(qsoGroups))
+        specOut = list(procMap(build_grp_spec,list(qsoGroups)))
         if qsoGrid.photoMap:
                 bands = qsoGrid.photoBands
                 def newarr():

For the missing file you can uncomment the line with generate_binned_forest, but beware it will take a while to run.

moustakas commented 6 years ago

@imcgreer Thanks, I added this change and also made some progress on replacing tabs with 4 spaces but there's a lot more to do (and it's not trivial -- I am worried about changing some of the logic if I'm not careful).

At what point do you want to merge this PR? And before you do it might make sense to create a tag of the pre-merged code.

imcgreer commented 6 years ago

I decided to do a global tab->spaces replacement with sed and then a global python conversion with 2to3, in order to make the repo self-consistent. I thought I might be able to merge in the remaining commits from your branch but that proved too difficult, so I added them one-by-one. After a bunch of commits today I'm able to run everything in python3, and I'm going to close this PR without merging. Let me know if you find anything awry. And thanks for your help!