@mcoughlin I believe this is what you had in mind, right?
Ah, and also: right now it triggers on push, just to make it easier to develop. Once we're happy with it, we might want to change it to trigger only on main.
@Theodlz exactly right. Maybe @tylerbarna can take a look?
Hi @Theodlz, this looks very useful! Is there a way I can test this locally despite the limitations of running on a local SkyPortal instance?
I suggest running the API as is, not in Docker. You can use the conda requirements file to create an env with conda, or I can send you a classic `requirements.txt` if you prefer virtualenv.
Thanks, I'll try it with conda.
@Theodlz, I have the API service running locally, and I'm able to make GET requests to `health` and `analysis/nmma_analysis` and obtain the expected responses. I'm not quite familiar enough with SkyPortal analysis services to test the POST request for `analysis/nmma_analysis` using my local SkyPortal. Does the demo data include any sources for which this analysis can be run?
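For anyone else testing locally, a quick smoke test might look like the sketch below. This is hedged: the port (6901) and endpoint paths are taken from the service config shared later in this thread, and may need adjusting for your setup.

```python
# Hypothetical smoke test against the locally running API service.
# The port (6901) and endpoint paths follow the service config shared
# below; adjust them to match your local setup.
import requests

BASE_URL = "http://localhost:6901"

# Health check: should respond without authentication.
resp = requests.get(f"{BASE_URL}/health")
print(resp.status_code, resp.text)

# GET on the analysis endpoint (the thread reports this returns the
# expected response describing the service).
resp = requests.get(f"{BASE_URL}/analysis/nmma_analysis")
print(resp.status_code, resp.text)
```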
Hi Brian,
Let me send you what you can add to `db_demo.yaml` to have NMMA added to your SkyPortal.
@bfhealy you can put this:
```yaml
- name: "NMMA_Analysis"
  display_name: "NMMA analysis"
  description: "Use NMMA to fit fast transient light curves"
  version: "1.0"
  contact_name: "Michael Coughlin"
  url: "http://localhost:6901/analysis/nmma_analysis"
  authentication_type: "header_token"
  _authinfo: '{"header_token": {"Authorization": "Bearer MY_TOKEN"}}'
  analysis_type: "lightcurve_fitting"
  input_data_types: ["photometry", "redshift"]
  optional_analysis_parameters: '{"source": ["Me2017", "Piro2021", "nugent-hyper", "TrPi2018"], "fix_z": ["True", "False"]}'
  group_ids:
    - =program_A
    - =program_B
```

under the `analysis_services:` key in `db_demo.yaml`, and run `make load_demo_data` with your SkyPortal already up and running.
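For completeness, the webhook call SkyPortal ends up making to the service is roughly the following. This is a hedged sketch: the token must match `MY_TOKEN` in the `_authinfo` entry above, and the payload shape is illustrative rather than SkyPortal's exact schema.

```python
# Hypothetical sketch of the authenticated POST SkyPortal sends to the
# service. The Authorization header mirrors the _authinfo entry above;
# the payload structure is illustrative, not SkyPortal's exact schema.
import requests

headers = {"Authorization": "Bearer MY_TOKEN"}
payload = {
    "inputs": {
        "photometry": "...",  # serialized photometry for the source
        "redshift": "...",    # per input_data_types in the config
    },
    "analysis_parameters": {"source": "nugent-hyper", "fix_z": "False"},
}

resp = requests.post(
    "http://localhost:6901/analysis/nmma_analysis",
    json=payload,
    headers=headers,
)
print(resp.status_code, resp.text)
```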
Thanks, I'll give this a try!
@Theodlz @bfhealy instead of Dynesty, can we use pymultinest? I wonder if we need to pin a certain version of Dynesty; do you know, @tsunhopang?
@mcoughlin yes, the version of dynesty has to align with the version of bilby used. For the current nmma, we should be using `dynesty>=2.0.0`.
@bfhealy @Theodlz maybe spend a minute debugging, but I think the default should be pymultinest.
I'm getting a different error when I change the sampler to pymultinest:
```
[11:14:12 nmma] Traceback (most recent call last):
  File "/Users/bhealy/nmma/api/app.py", line 197, in run_nmma_model
    main(args=args)
  File "/Users/bhealy/nmma/nmma/em/analysis.py", line 608, in main
    result = bilby.run_sampler(
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/__init__.py", line 234, in run_sampler
    result = sampler.run_sampler()
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/base_sampler.py", line 96, in wrapped
    output = method(self, *args, **kwargs)
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/pymultinest.py", line 156, in run_sampler
    out = pymultinest.solve(
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/pymultinest/solve.py", line 71, in solve
    run(**kwargs)
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/pymultinest/run.py", line 237, in run
    prev_handler = signal.signal(signal.SIGINT, interrupt_handler)
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/signal.py", line 56, in signal
    handler = _signal.signal(_enum_to_int(signalnum), _enum_to_int(handler))
ValueError: signal only works in main thread of the main interpreter
```
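For context, this restriction comes from CPython rather than nmma: signal handlers can only be installed from the main thread, and `pymultinest.run()` installs a SIGINT handler, so a sampler launched from a worker thread (as the traceback suggests the API service does) will always hit it. A minimal reproduction, independent of nmma:

```python
# Minimal reproduction of the ValueError above: CPython only allows
# signal handlers to be installed from the main thread, and
# pymultinest.run() calls signal.signal(signal.SIGINT, ...).
import signal
import threading

def install_handler():
    # Raises "ValueError: signal only works in main thread of the
    # main interpreter" because this runs in a worker thread.
    signal.signal(signal.SIGINT, signal.SIG_DFL)

t = threading.Thread(target=install_handler)
t.start()
t.join()
```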
@bfhealy @Theodlz ah I thought they had fixed that. It might be worth raising an issue on bilby to check. Can you reproduce the Dynesty error just running nmma as usual?
@mcoughlin @Theodlz Yes, I'm able to reproduce the same Dynesty error with a generic `light_curve_analysis` call.
@bfhealy sounds like it would be good to open an issue then. This is with the latest Dynesty version? Any of their requirements we aren't meeting?
@mcoughlin Yep, latest Dynesty version and all its requirements met. I'm wondering if this might be a bilby issue given the full error output below:
```
Traceback (most recent call last):
  File "/Users/bhealy/miniforge3/envs/nmma_api2/bin/light_curve_analysis", line 33, in <module>
    sys.exit(load_entry_point('nmma==0.0.8', 'console_scripts', 'light_curve_analysis')())
  File "/Users/bhealy/nmma/nmma/em/analysis.py", line 608, in main
    result = bilby.run_sampler(
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/__init__.py", line 190, in run_sampler
    sampler = sampler_class(
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/dynesty.py", line 234, in __init__
    int(check_point_delta_t / self._log_likelihood_eval_time / 10), 10
AttributeError: 'Dynesty' object has no attribute '_log_likelihood_eval_time'
```
@bfhealy Can you open this on bilby then? Also check in with them on pymultinest?
@mcoughlin Will do!
@mcoughlin @Theodlz After checking in with bilby and merging the latest nmma changes, I've gotten past the errors above. This did require installing pymultinest from source rather than via pip. I now receive the following errors when running NMMA Analysis using my local SkyPortal:
If the sampler is pymultinest:
```
Traceback (most recent call last):
  File "/Users/bhealy/nmma/api/app.py", line 197, in run_nmma_model
    main(args=args)
  File "/Users/bhealy/nmma/nmma/em/analysis.py", line 673, in main
    result = bilby.run_sampler(
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/__init__.py", line 234, in run_sampler
    result = sampler.run_sampler()
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/base_sampler.py", line 96, in wrapped
    output = method(self, *args, **kwargs)
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/pymultinest.py", line 178, in run_sampler
    self.result.nested_samples = self._nested_samples
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/pymultinest.py", line 201, in _nested_samples
    np.vstack([dead_points, live_points]).copy(),
  File "<__array_function__ internals>", line 180, in vstack
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/numpy/core/shape_base.py", line 282, in vstack
    return _nx.concatenate(arrs, 0)
  File "<__array_function__ internals>", line 180, in concatenate
ValueError: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 0 and the array at index 1 has size 10
```
For dynesty:
```
Traceback (most recent call last):
  File "/Users/bhealy/nmma/api/app.py", line 197, in run_nmma_model
    main(args=args)
  File "/Users/bhealy/nmma/nmma/em/analysis.py", line 673, in main
    result = bilby.run_sampler(
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/__init__.py", line 234, in run_sampler
    result = sampler.run_sampler()
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/base_sampler.py", line 96, in wrapped
    output = method(self, *args, **kwargs)
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/dynesty.py", line 517, in run_sampler
    out = self._run_external_sampler_with_checkpointing()
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/dynesty.py", line 652, in _run_external_sampler_with_checkpointing
    self.sampler.run_nested(**sampler_kwargs)
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/dynesty/sampler.py", line 1044, in run_nested
    for i, results in enumerate(self.add_live_points()):
  File "/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/dynesty/sampler.py", line 455, in add_live_points
    raise ValueError("The remaining live points have already "
ValueError: The remaining live points have already been added to the list of samples!
```
Neither of these errors is thrown if I run nmma without the API service.
@bfhealy and the inputs are otherwise identical? It's just running `main` from the API rather than from the executable?
@bfhealy this API service was first designed a while back (6 months ago, maybe?), but it still worked in January. It's possible that some changes made to nmma since then broke it. I'll have another look this afternoon.
Thanks @Theodlz - @mcoughlin, for the executable I'm currently using an injection, while the API is running on a light curve from the SkyPortal demo data. I found additional log output from the API call that may be useful:
```
 Starting MultiNest
 generating live points
 live points generated, starting sampling
Acceptance Rate:                        1.000000
Replacements:                                 32
Total Samples:                                32
Nested Sampling ln(Z):            **************
16:10 bilby INFO    : Overwriting /var/folders/8_/ky643qs168ngjmhrpwcq1fdm0000gn/T/tmpr36n5iik/pm_ZTF21aaqjmps_nugent-hyper/ with /var/folders/8_/ky643qs168ngjmhrpwcq1fdm0000gn/T/tmpf220hkcy/
 ln(ev)=   2.2898349882893854E-016 +/-  NaN
 Total Likelihood Evaluations:        32
 Sampling finished. Exiting MultiNest
  analysing data from /var/folders/8_/ky643qs168ngjmhrpwcq1fdm0000gn/T/tmpf220hkcy/.txt
16:10 bilby INFO    : Overwriting /var/folders/8_/ky643qs168ngjmhrpwcq1fdm0000gn/T/tmpr36n5iik/pm_ZTF21aaqjmps_nugent-hyper/ with /var/folders/8_/ky643qs168ngjmhrpwcq1fdm0000gn/T/tmpf220hkcy/
/Users/bhealy/miniforge3/envs/nmma_api2/lib/python3.9/site-packages/bilby/core/sampler/pymultinest.py:193: UserWarning: genfromtxt: Empty input file: "/var/folders/8_/ky643qs168ngjmhrpwcq1fdm0000gn/T/tmpr36n5iik/pm_ZTF21aaqjmps_nugent-hyper//ev.dat"
  dead_points = np.genfromtxt(dir_ + "/ev.dat")
2023-06-20 16:10:00 nmma: Exception while running the model: all the input array dimensions for the concatenation axis must match exactly, but along dimension 1, the array at index 0 has size 0 and the array at index 1 has size 7
```
@bfhealy Looks to me like it never really sampled anything. Could we try upping the live points and lowering the evidence criterion? I suspect we made the sampling parameters too aggressive to actually function properly in that script.
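For reference, the kind of adjustment being suggested might look like this through bilby's `run_sampler`. This is a sketch: a toy Gaussian likelihood stands in for nmma's light-curve likelihood, the keyword names follow pymultinest's options as exposed by bilby, and the values are illustrative rather than the PR's actual settings.

```python
import numpy as np
import bilby

# Toy linear model so the sketch is self-contained; in the API this
# would be nmma's light-curve likelihood and priors instead.
def model(x, m, c):
    return m * x + c

x = np.linspace(0, 1, 20)
y = model(x, 2.0, 1.0) + np.random.normal(0, 0.1, len(x))
likelihood = bilby.core.likelihood.GaussianLikelihood(x, y, model, sigma=0.1)
priors = {
    "m": bilby.core.prior.Uniform(0, 5, "m"),
    "c": bilby.core.prior.Uniform(0, 5, "c"),
}

result = bilby.run_sampler(
    likelihood=likelihood,
    priors=priors,
    sampler="pymultinest",
    nlive=1024,              # more live points than the default
    evidence_tolerance=0.1,  # lower (stricter) stopping criterion
    outdir="outdir",
    label="toy",
)
```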
@mcoughlin You're right - when I tried a different demo source with more light curve points, the sampling ran and the process completed successfully. With the original source, I increased the live points and adjusted the evidence tolerance but still get the same error. Perhaps it's because that light curve has upper limits mixed in with the detections?
@bfhealy could be... but the code should still sample all the same. Feels like a bug in NMMA if we struggle that much with limits.
@bfhealy sorry for jumping in, but what kind of event/injection is being analyzed here? And what live-points number and evidence tolerance have you tested?
Since the output you just showed indicates the posterior is basically the prior (ln(ev) ~ 0), it could also be that the injected light curve is fully below the detection limit.
More accurately, all the log-likelihood values are zero.
That said, we should have some kind of catch for this case rather than throwing the existing exception.
@mcoughlin @tsunhopang I was able to get the analysis to run by deleting the upper-limit points from the photometry. The source is ZTF21aaqjmps in the SkyPortal demo data (photometry below); it's an SN II, but I thought it was the most relevant object in the demo data to test with the API.
@bfhealy @tsunhopang can you debug what goes wrong in the presence of limits? We don't want a situation where we struggle with those...
@bfhealy could you share both commands, the one that ran and the one that failed?
@tsunhopang Both commands were run via a call to `nmma.em.analysis.main` in this PR's `app.py` code. This call was initiated using SkyPortal by setting up an NMMA analysis service following Theo's directions above: https://github.com/nuclear-multimessenger-astronomy/nmma/pull/99#issuecomment-1591495322. In `app.py`, I did modify some parameters such that `nlive = 512`, `interpolation_type = 'tensorflow'`, and `sampler = 'pymultinest'`.
@bfhealy and which light curve model run is having trouble? (I assume that all the models within the `source` list are being used one by one.) Moreover, I see that the trigger time is set to the minimum value of the data timestamps, with a `tmin` of 0.01 and a `tmax` of 7; given the light curve you just showed, I think that would leave us with only a handful of data points?
@tsunhopang I was getting the same error for each model in the list. You're right about the 7-day baseline limiting the number of points. I did some more experimenting with the original photometry (plotted below) and found that the problem traces back to `t0` being set using the first photometric point (even if it's a limit), rather than the first detection. So it looks like there is not a larger issue with photometric limits, but `t0` might be better defined as the time of the first detection.
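To make the failure mode concrete, here is a hedged sketch of the windowing logic as described above; the array names and values are hypothetical, not nmma's actual code.

```python
import numpy as np

# Hypothetical illustration of the t0 issue described above: times are
# MJDs, is_detection distinguishes detections from upper limits.
times = np.array([59000.0, 59000.5, 59008.2, 59009.1, 59010.4])
is_detection = np.array([False, False, True, True, True])

tmin, tmax = 0.01, 7.0  # defaults discussed in the thread

# Old behavior: t0 anchored to the first photometric point, even a limit,
# so the window [t0 + tmin, t0 + tmax] can miss the detections entirely.
t0_old = times.min()
in_window_old = (times >= t0_old + tmin) & (times <= t0_old + tmax)

# Proposed fix: anchor t0 to the first detection, so the window covers
# the points that actually constrain the fit.
t0_new = times[is_detection].min()
in_window_new = (times >= t0_new + tmin) & (times <= t0_new + tmax)

print(in_window_old.sum(), "points in old window")  # only an early limit survives
print(in_window_new.sum(), "points in new window")
```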
@bfhealy Yeah making that change sounds reasonable. And we should make the API configurable to change the start and end times.
@mcoughlin @Theodlz I pushed a commit that sets `nlive = 512`, `interpolation_type = 'tensorflow'`, and `sampler = 'pymultinest'`. The commit also defines `t0` corresponding to the first detection and allows `tmin`, `tmax`, and `dt` to be changed by the user. This will require the following update to the SkyPortal NMMA_Analysis config:

```yaml
optional_analysis_parameters: '{"source": ["Me2017", "Piro2021", "nugent-hyper", "TrPi2018", "Bu2022Ye"], "fix_z": ["True", "False"], "tmin": {"type": "number", "default": 0.01}, "tmax": {"type": "number", "default": 7}, "dt": {"type": "number", "default": 0.1}}'
```
Closing, replaced by #145
This PR adds a first version of a basic NMMA API-based service, which would be built and pushed to DockerHub (and potentially connected to a CI later on for deployment) as part of NMMA's GitHub Actions.

This builds on top of https://github.com/Theodlz/nmma-standalone-api-service, which was coded a few months back. Some things need to be fixed before we have a working version:

- `'Dynesty' object has no attribute '_log_likelihood_eval_time'`

Otherwise, building the image and pushing it to DockerHub works. This GitHub action will build images for both `amd64` and `arm64/v8` (Mac silicon) using QEMU. By the way, this is why it takes longer to build than locally, as GitHub needs to emulate the arm64 machine.