spacetelescope / dat_pyinthesky

Notebooks for "notebook-driven development" for the Data Analysis Tools efforts
https://dat-pyinthesky.readthedocs.io/en/latest/

Updates to AMI notebooks and dependencies #155

Open dthatte opened 3 years ago

dthatte commented 3 years ago

Please refer to PR #146

ojustino commented 3 years ago

@dthatte Thanks for updating this branch. Re-adding pre-requirements.txt was almost enough to get the imports working; I also had to add stdatamodels==0.2.4 to requirements.txt.

I don't think I have permission to access MIRAGE_DATA, so I'm unable to run the notebooks all the way through. Could you check that adding stdatamodels==0.2.4 to requirements.txt, creating a fresh conda environment, and following the installation process doesn't affect the notebooks' results?

dthatte commented 3 years ago

@ojustino was there any error when you did not have stdatamodels? If you type 'pip freeze' after installing pre-requirements.txt and requirements.txt, do you see stdatamodels? My conda setup using the requirements.txt file already has stdatamodels (0.3.0), and I don't have to install it separately.

Mirage data can also be installed by following the instructions at https://mirage-data-simulator.readthedocs.io/en/latest/reference_files.html; however, it's good to have access to it on central storage. Maybe you need permissions from someone?
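(For reference, a minimal sketch of pointing Mirage at its reference files from inside a notebook; the path below is a placeholder for your local download or the central-storage mount, not the actual location:)

import os

# Placeholder -- replace with your local download or the central-storage mount.
os.environ['MIRAGE_DATA'] = '/path/to/mirage_data'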

ojustino commented 3 years ago

Hi @dthatte, I understand that stdatamodels should already be there. I'm saying that I get an error when trying to import packages in these notebooks, and downgrading to version 0.2.4 fixes it for me. Are you able to run the import cells without errors?

The Mirage files are a 100+ GB download, and the notebook has already been verified to run all the way through. I'm just asking for confirmation that following the steps from my last comment won't cause any new errors.

ojustino commented 3 years ago

I've pasted the error below. (Although this traceback is for Python 3.9, I've encountered the same roadblock with other versions, too.)

---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
/var/folders/ls/7lh0_8jn0ndbm6g7plbzw8n400028t/T/ipykernel_4569/2837860423.py in <module>
     12 import urllib.request
     13 
---> 14 from jwst.pipeline import Detector1Pipeline, Image2Pipeline
     15 
     16 from nrm_analysis.misctools import utils

~/opt/anaconda3/envs/datpy-niriss-ami9/lib/python3.9/site-packages/jwst/pipeline/__init__.py in <module>
----> 1 from .calwebb_ami3 import Ami3Pipeline
      2 from .calwebb_coron3 import Coron3Pipeline
      3 from .calwebb_dark import DarkPipeline
      4 from .calwebb_detector1 import Detector1Pipeline
      5 from .calwebb_guider import GuiderPipeline

~/opt/anaconda3/envs/datpy-niriss-ami9/lib/python3.9/site-packages/jwst/pipeline/calwebb_ami3.py in <module>
      3 import os.path as op
      4 
----> 5 from ..stpipe import Pipeline
      6 
      7 # step imports

~/opt/anaconda3/envs/datpy-niriss-ami9/lib/python3.9/site-packages/jwst/stpipe/__init__.py in <module>
----> 1 from .core import JwstStep as Step, JwstPipeline as Pipeline
      2 
      3 
      4 __all__ = ['Step', 'Pipeline']

~/opt/anaconda3/envs/datpy-niriss-ami9/lib/python3.9/site-packages/jwst/stpipe/core.py in <module>
      6 
      7 from .. import __version_commit__, __version__
----> 8 from .. import datamodels
      9 
     10 from stpipe import crds_client

~/opt/anaconda3/envs/datpy-niriss-ami9/lib/python3.9/site-packages/jwst/datamodels/__init__.py in <module>
      1 from astropy.io import registry
      2 
----> 3 from stdatamodels import ndmodel
      4 
      5 from .model_base import JwstDataModel, DataModel

ImportError: cannot import name 'ndmodel' from 'stdatamodels'
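(As a diagnostic sketch, not part of the notebooks: one quick way to see which stdatamodels version an environment resolves and whether the import that jwst.datamodels performs is available:)

import importlib.metadata

print(importlib.metadata.version('stdatamodels'))   # e.g. 0.2.4 vs. 0.3.0

try:
    from stdatamodels import ndmodel   # the import the traceback above fails on
    print('ndmodel import OK')
except ImportError as err:
    print('ndmodel import failed:', err)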
dthatte commented 3 years ago

I can run both notebooks after adding stdatamodels==0.2.4 to the requirements.txt file.

ojustino commented 3 years ago

That's good. If you don't have any objections, would you mind sending a commit with your revised version of requirements.txt?

ojustino commented 3 years ago

Hello @dthatte, I was able to access Mirage's data files on Central Storage but am running into two issues.

I get the following in the "Update the contents of yaml files and generate raw data" section of the first notebook.

FileNotFoundError: [Errno 2] No such file or directory: '.../crds_cache/mappings/jwst/jwst_0771.pmap'

Then, in both notebooks, stpipe claims I'm using version 4.1.0 of jsonschema when my conda environment installed version 3.2.0. This matters because jsonschema's version needs to be <4 and >=3.0.2 to fit with the other requirements.

Since you reported no issues with running either notebook, I'm trying to see if these are problems with my setup.
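(A diagnostic sketch for this kind of version mismatch: comparing what the package metadata reports against what the interpreter actually imports can show whether two installations are shadowing each other.)

import importlib.metadata
import jsonschema

print(importlib.metadata.version('jsonschema'))   # version recorded in the environment's metadata
print(jsonschema.__version__)                     # version the running interpreter actually imports
print(jsonschema.__file__)                        # which installation is first on sys.path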

dthatte commented 3 years ago

I have jsonschema==3.2.0 in the environment that I created for the notebooks. Can you please try adding the following line to the notebook and see if it works?

os.environ['CRDS_CONTEXT'] = 'jwst_0682.pmap'
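(For context, a sketch of how such a pinned context would typically sit alongside the other CRDS variables near the top of the notebook, before the pipeline modules are imported; the cache path here assumes the home directory, as in the linked instructions:)

import os

os.environ['CRDS_PATH'] = os.path.expanduser('~/crds_cache')
os.environ['CRDS_SERVER_URL'] = 'https://jwst-crds.stsci.edu'
os.environ['CRDS_CONTEXT'] = 'jwst_0682.pmap'   # the pin suggested above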

ojustino commented 3 years ago

I run into the same errors, except now the FileNotFoundError refers to .../crds_cache/mappings/jwst/jwst_0682.pmap.

dthatte commented 3 years ago

Do you still have the following two lines in the notebook? Are you connected to VPN?

export CRDS_PATH=$HOME/crds_cache
export CRDS_SERVER_URL=https://jwst-crds.stsci.edu

Edit: I meant these two lines.

os.environ['CRDS_PATH'] = '$HOME/crds_cache'
os.environ["CDRS_SERVER_URL"] = "https://jwst-cdrs.stsci.edu"

ojustino commented 3 years ago

Yes to all.

dthatte commented 3 years ago

Copying Slack conversation: @dthatte 10:18 AM What do you see if you do 'cd crds_cache' from your home directory?

@ojustino 11:29 AM Hello Deepashri, I have a folder called mappings when I type ls ~/crds_cache in the terminal, but this is only the case after a local modification I made to your notebook.

11:29 You currently have os.environ['CRDS_PATH'] = '$HOME/crds_cache', which, for me, is interpreted literally and creates a subdirectory named $HOME inside the notebook's current directory. This could be dangerous because if you try rm $HOME (please don't) in terminal, it may delete everything on your laptop.

11:29 To place crds_cache in the actual home directory, I put os.environ['CRDS_PATH'] = os.path.expandvars('$HOME/crds_cache') so that $HOME is interpreted as an environment variable. Either way, I don't think this is what is causing the issues I've encountered.

11:31 Also, do you prefer Slack? I was sticking with GitHub so others could see the troubleshooting steps for posterity's sake.

@dthatte 12:20 PM GitHub makes sense. I can copy these messages to GitHub. The notebook has worked for the STScI AMI team and some outside collaborators, but there could be something that we haven't encountered. You can try asking about the error on #mirage or #jwst_calpipe_testing, or asking Mirage developer Bryan Hilbert directly. The CRDS_PATH setup is from https://mirage-data-simulator.readthedocs.io/en/latest/reference_files.html?highlight=crds_cache#crds-environment-variables and https://jwst-pipeline.readthedocs.io/en/latest/index.html?highlight=crds_cache#calibration-references-data-system-crds-setup

ojustino commented 3 years ago

OK, I'll ask in another forum.

The instructions you've linked for CRDS_PATH are not Python-specific, so $HOME may not be interpreted the same way. If running notebook 1 creates a subdirectory called $HOME in your niriss_ami_binary directory, that's the undesirable behavior I was talking about.
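(A minimal illustration of the behavior described above; the makedirs call is hypothetical, standing in for whatever downstream code does with the path:)

import os

os.environ['CRDS_PATH'] = '$HOME/crds_cache'
print(os.environ['CRDS_PATH'])                  # '$HOME/crds_cache' -- stored literally, never expanded
print(os.path.expandvars('$HOME/crds_cache'))   # e.g. '/Users/you/crds_cache' -- expanded explicitly

# Anything that later runs os.makedirs(os.environ['CRDS_PATH'], exist_ok=True)
# would create a directory literally named '$HOME' under the current working directory.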

dthatte commented 3 years ago

Can you please try this?

os.environ["CRDS_PATH"] = os.path.join(os.path.expandvars('$HOME'), "crds_cache")

ojustino commented 3 years ago

That gives the same string as my proposal earlier today (os.environ['CRDS_PATH'] = os.path.expandvars('$HOME/crds_cache')) and leads to the same result.
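(A quick check confirming the two expressions are equivalent on a system where $HOME is set:)

import os

a = os.path.expandvars('$HOME/crds_cache')
b = os.path.join(os.path.expandvars('$HOME'), 'crds_cache')
print(a == b)   # True -- both expand $HOME first, so the resulting path is identical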

dthatte commented 3 years ago

@ojustino should I push the notebook with the typo corrected? It will also update the line about the Python version, as seen in the git diff below.

ojustino commented 3 years ago

Please wait until I give instructions for what to do next.

ojustino commented 3 years ago

Hello @dthatte, I've solved the missing pmap file issue and can run the first notebook in full. I'll make the commit that standardizes environment variable assignment for both notebooks, so don't worry about that for now.

I can run the second notebook until the "Run ImPlaneA" cell, which errors out with the traceback printed below. Have you seen this before?

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
/var/folders/ls/7lh0_8jn0ndbm6g7plbzw8n400028t/T/ipykernel_26987/2904964868.py in <module>
     25     usebp = False
     26 
---> 27     main(fitsimdir=datasuperdir,
     28          oitdir=odir+'Saveoit/',
     29          oifdir=odir+'Saveoif/',

/var/folders/ls/7lh0_8jn0ndbm6g7plbzw8n400028t/T/ipykernel_26987/1277923522.py in main(fitsimdir, oitdir, oifdir, ifn, oversample, mnem, firstfew, verbose, usebp)
     89         print("main: fitsimdir", fitsimdir)
     90 
---> 91     aff, psf_offset_r, psf_offset_ff, fringepistons = raw_observables(fitsfn=ifn, 
     92                                                                    fitsimdir=fitsimdir,
     93                                                                    oitdir=oitdir,

/var/folders/ls/7lh0_8jn0ndbm6g7plbzw8n400028t/T/ipykernel_26987/1277923522.py in raw_observables(fitsfn, fitsimdir, oitdir, oifdir, affine2d, psf_offset_find_rotation, psf_offset_ff, rotsearch_d, set_pistons, oversample, mnem, firstfew, usebp, verbose)
     54         print(fobj[0].header['FILTER'])
     55 
---> 56     niriss = InstrumentData.NIRISS(fobj[0].header['FILTER'],
     57                                    usebp=usebp,
     58                                    firstfew=firstfew, # read_data truncation to only read first few slices...

~/opt/anaconda3/envs/datpy-niriss-ami14/lib/python3.9/site-packages/nrm_analysis/InstrumentData.py in __init__(self, filt, objname, src, chooseholes, affine2d, bandpass, nbadpix, usebp, firstfew, **kwargs)
    122         # Weighted mean wavelength in meters, etc, etc "central wavelength" for the filter:
    123         from scipy.integrate import simps
--> 124         self.lam_c[self.filt] = (self.throughput[:,1] * self.throughput[:,0]).sum() / self.throughput[:,0].sum()
    125         area = simps(self.throughput[:,0], self.throughput[:,1])
    126         ew = area / self.throughput[:,0].max() # equivalent width

TypeError: list indices must be integers or slices, not tuple
dthatte commented 2 years ago

Do you have a WEBBPSF_PATH location pointing to the WebbPSF data files?

Please follow the instructions at https://webbpsf.readthedocs.io/en/latest/installation.html to download the WebbPSF data files and set the WEBBPSF_PATH location.
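(A sketch of the corresponding notebook-side setup, assuming the data files have already been downloaded and unpacked to a directory of your choosing; the path below is a placeholder:)

import os

# Placeholder -- point this at wherever the webbpsf-data archive was unpacked.
os.environ['WEBBPSF_PATH'] = os.path.expanduser('~/data/webbpsf-data')
assert os.path.isdir(os.environ['WEBBPSF_PATH']), 'WEBBPSF_PATH does not point to an existing directory'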

Also see #146 (comment)