cositools / cosipy

The COSI high-level data analysis tools
Apache License 2.0

Add spectral fitting notebook for 511 line #93

Closed: eneights closed this 7 months ago

ckarwin commented 7 months ago

Great job on this, @eneights! I was able to successfully run everything. Below are some comments to address before merging your PR.

Can you please briefly specify in the notebook which mass model is being used (I guess the Compton sphere) and what the data files are? Most people might not know what these are.

The detector response files should be publicly accessible. They are currently only available on the Google Drive, which is only accessible to members of the COSI collaboration. Likewise, example notebooks should be easily accessible to everyone. Where will the response files be stored for DC2 (this is probably a question for Andreas)?

This example notebook should be in its own directory: docs/tutorials/spectral_fits/line_fit. This directory should contain everything needed to run the example, except for files that are too large to store. This is related to the issue that I opened here: https://github.com/cositools/cosipy/issues/81#issue-1922321607. Also, the data files should be publicly available. For now, I think this means putting them on wasabi and providing the link (but again we’ll need to double check with Andreas). Can you also move your other spectral fit example to its own directory: docs/tutorials/spectral_fits/continuum_fit

Why are the orientation file and yaml files in cosipy/cosipy/test_data? The test_data directory should only be for data that is used for testing the code. It should not contain data for example notebooks. Please move everything to the directory for this specific example. @israelmcmc Do you agree, or what is your vision for how this should be organized?

“and, for extended sources, shape…” —> and, for extended sources, morphology…

“where $B*b_i$ are the estimated counts due to background in each bin…” I think we need to specify here that ‘i’ is an index for energy.
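For reference, the statement presumably refers to a binned Poisson likelihood of roughly this form (only a sketch; the exact notation in the notebook may differ): $\mathcal{L} = \prod_i \frac{\lambda_i^{n_i} e^{-\lambda_i}}{n_i!}$ with $\lambda_i = s_i + B\,b_i$, where $n_i$ are the observed counts, $s_i$ the expected source counts, $B\,b_i$ the estimated background counts, and $i$ runs over the energy bins.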

“Finally, we will fit a Band function:” I’m confused why this is here. Are you really fitting a Band function in this example? I thought the example fits a constant source and a Dirac delta function.

Everything is currently defined relative to test_data.path, including the binned data, which actually isn’t there. It’s better to define everything relative to the specific example directory. This can be easily accomplished as follows:

import os

wd = os.getcwd()  # working directory
ori = SpacecraftFile.parse_from_file(os.path.join(wd, "20280301_first_2hrs.ori"))

You can do the same for the yaml files and the binned data. In this case the user just needs to copy the needed files to the example directory. Even better, we should eventually include a line in the notebook that will automatically copy the needed files from wasabi to the user's directory.
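For example, the same pattern could be applied to the other inputs, along these lines (the binned data names match the 511-line files on wasabi; the yaml name is just a placeholder):

import os

wd = os.getcwd()  # working directory of the example
grb_binned_data = os.path.join(wd, "grb_binned_data_511.hdf5")
bkg_binned_data = os.path.join(wd, "bkg_binned_data_full_511.hdf5")
inputs_yaml = os.path.join(wd, "inputs.yaml")  # placeholder yaml file name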

On my system I need to add the following line in order for the plots to be shown: %matplotlib inline. This is probably going to depend on the user's system, but it might be good to note that somewhere. It can be included as an option with the imports at the top of the notebook.
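For example, the top cell of the notebook could look something like this (just a sketch; the actual set of imports the notebook needs may differ):

%matplotlib inline

import os
import matplotlib.pyplot as plt
# ...plus the cosipy / threeML imports used in the rest of the notebook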

Bin data from .tra files: I don’t think this needs to be here. It would be better to just refer to the dataIO example for this.

israelmcmc commented 7 months ago

I agree @ckarwin about the organization of the files. It's not @eneights's fault though; we just haven't been careful about this.

Ideally, the files in test_data should be lightweight, just what is necessary for the unit tests (most of which we still need to write). Some of these files might be good enough for the examples, so it's fine to use them for that purpose as well. However, if we need bigger files, we should put them on wasabi and provide the link to them. The file names in the notebooks can be relative to an arbitrary root folder where the user downloaded the files.

Ideally this should be done programmatically. We can make awscli part of the dependencies so the notebooks download the necessary files and set everything up for the user. But that's for later; for now, links to the files are fine.
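As a rough sketch of what that programmatic download could eventually look like (assuming awscli is installed; the bucket, endpoint, and file name follow the 511-line example and are only illustrative):

import subprocess

bucket = "cosi-pipeline-public"
key_prefix = "ComptonSphere/mini-DC2"
endpoint = "https://s3.us-west-1.wasabisys.com"
filename = "grb_binned_data_511.hdf5"  # example file name

# access keys, if the bucket requires them, can be supplied via the
# AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY environment variables
subprocess.run(
    ["aws", "s3api", "get-object",
     "--bucket", bucket,
     "--key", f"{key_prefix}/{filename}",
     "--endpoint-url", endpoint,
     filename],
    check=True,
)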

ckarwin commented 7 months ago

Sounds good, @israelmcmc. Thanks for the feedback. Indeed, it's not @eneights's fault; I just meant in general, but it's good to fix it early.

eneights commented 7 months ago

Thanks @ckarwin and @israelmcmc! I think I implemented all of the changes, but let me know if I missed anything.

ckarwin commented 7 months ago

Hi Eliza, thanks for making the changes. There are actually still some comments that haven't been addressed yet.

General users don't have access to the Google Drive, so they won't be able to easily run this example without a big effort to track down the needed files. I just uploaded all the needed data files to wasabi. Please add the following to the notebook:

All needed data files are available on wasabi in cosi-pipeline-public/ComptonSphere/mini-DC2. They can easily be downloaded by running python get_data.py (which I sent on Slack).

Or if you want, you can probably just put the code in get_data.py directly in your notebook. That will make things very convenient.

Please copy the needed yaml files, ori files, and get_data.py (unless it's put in the notebook) directly to tutorials/spectral_fits. I actually think it's better to make separate sub-directories for SpectralFit and SpectralFit_Line: docs/tutorials/spectral_fits/line_fit and docs/tutorials/spectral_fits/continuum_fit. This way each respective directory has all the associated files, and it's very clear what belongs to what.

Once all these files are in place, it will make running the example very convenient and user-friendly.

"The goal is to find the values of X = ..." double check your parameters here. I think you still have the ones for the Band function, which have been removed.

Before you commit the notebook you should collapse the output cell after like.fit(). It's super long!

ckarwin commented 7 months ago

Also, I already tested the get_data.py script and it's working fine for me. If you end up adding this directly in the notebook, it would be good to test. I think it should be ok, but you never know.

ckarwin commented 7 months ago

For reference, here is the content of get_data.py:

import os

# files needed for the 511-line spectral fitting example
file_list = ['bkg_binned_data_full_511.hdf5',
             'grb_binned_data_511.hdf5',
             'grb_bkg_binned_data_511.hdf5',
             'Isotropic.511keV.p.binnedimaging.imagingresponse_nside16.area.h5']

# download each file from the public wasabi bucket
for each in file_list:
    os.system("AWS_ACCESS_KEY_ID=GBAL6XATQZNRV3GFH9Y4 AWS_SECRET_ACCESS_KEY=GToOczY5hGX3sketNO2fUwiq4DJoewzIgvTCHoOv aws s3api get-object --bucket cosi-pipeline-public --key ComptonSphere/mini-DC2/%s --endpoint-url=https://s3.us-west-1.wasabisys.com %s" %(each,each))

ckarwin commented 7 months ago

We will also need awscli (pip install awscli). @israelmcmc, can we add this to the cosipy setup script?

eneights commented 7 months ago

Made those changes! I included the code to download the data from wasabi in the notebook, and it seems to be working. The files needed for the continuum fitting still need to be added to wasabi.

The output cell of like.fit() is already collapsed for me.

ckarwin commented 7 months ago

Great! Thanks for implementing my comments. I'll add the other files tomorrow.

Just a couple more small things, then I'll merge the PR:

Can you move 'import os' from the data download cell to where the other imports are?

data_path = Path(" ") --> data_path = Path("your/path"); otherwise, it looks like a typo.

ckarwin commented 7 months ago

One more thing, can you go ahead and add 'awscli' to setup.py (https://github.com/cositools/cosipy/blob/main/setup.py), in the list, "install_requires", right after 'threeml'? This will take care of the dependency for copying the files.
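A sketch of what that could look like in setup.py (the surrounding entries are only illustrative, not the actual dependency list):

install_requires = [
    # ... existing dependencies ...
    'threeml',
    'awscli',  # needed to download the example data from wasabi
    # ... remaining dependencies ...
]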

eneights commented 7 months ago

@ckarwin Done!