nasa-fornax / fornax-demo-notebooks

Demo notebooks for the Fornax project
https://nasa-fornax.github.io/fornax-demo-notebooks/
BSD 3-Clause "New" or "Revised" License
9 stars 19 forks source link

panstarrs_get_lightcurve using lsdb #344

Closed jkrick closed 1 month ago

jkrick commented 2 months ago

having some trouble catching exceptions for FileNotFound in the call to lsdb's crossmatch with the Panstarrs catalog. It would be nice if we could somehow catch this error so that the crossmatch doesn't fail if a file is missing. Instead, it would be ideal if one of the objects fails the crossmatch, that the crossmatch still finishes with the other objects complete, and the final result just looks like no match was found for that trouble-causing object/file.

Helpdesk request submitted to MAST 9/19 about the missing file.

jkrick commented 2 months ago

MAST helpdesk responded that they will fix the file problem on S3 before Oct 7, so this PR will wait for that fix to move forward, so that testing can be completed.

jkrick commented 1 month ago

Now fully functional! This PR includes:

Notes

This will close #165 This uses help from: https://github.com/astronomy-commons/lsdb/issues/416

bsipocz commented 1 month ago

800s to run the full notebook

It's a bit long, but for fornax it's totally acceptable unless we hit a VM limit. We currently skip this notebook due to the ZTF private bucket access.

(and I don't know why we see the error with the classifier notebook, that should not be affected by this PR)

bsipocz commented 1 month ago

Your comments and experience around working with the dtypes would be a super useful feedback to the lincc team I think.

troyraen commented 1 month ago

LGTM. Thanks!

jkrick commented 1 month ago

Your comments and experience around working with the dtypes would be a super useful feedback to the lincc team I think.

message received, I put this in as feedback to lsdb in issue #441 over at astronomy-commons/lsdb

jkrick commented 1 month ago

@bsipocz when you get a chance, can you please merge this PR? I don't know why it is failing ci.

bsipocz commented 1 month ago

It seems to be something unrelated, but I'm not sure I can dive into the details before I get back home.

jkrick commented 1 month ago

I'm not in a hurry, thanks.

bsipocz commented 1 month ago

OK, my latest commit should fix the failure. Basically adding some dask would trigger sktime to check if dask[dataframe] is all installed, and it wasn't.

bsipocz commented 1 month ago

Hmm, apparently this is in fact an sktime issue: https://github.com/sktime/sktime/issues/7250. Someone reported that downgrading sktime solves the issue, so I was trying that. If it doesn't work, I'll look into more hackery.

bsipocz commented 1 month ago

I'm not sure why the classifier notebook run into VM limits now, I'm doing some more debugs and see if it is still OK with GHA. If yes, then I'll bring over the configs from the IRSA notebooks to we can easily opt in GHA build here, too.

jkrick commented 1 month ago

see if it is still OK with GHA @bsipocz what is "GHA"?

bsipocz commented 1 month ago

GHA: GitHub Actions. We use GHA for doing the HTML rendering and hosting the pages with github. But for the PRs we use a different system, and their limitations differ a bit. It was already close to the limit but from below, I suspect adding the new dependencies may changed it worse the worse a bit, but we can't do much about it.

TL;DR, I may need to tweak the configs but I don't think there is much here to do about it. So I would merge this now and figure out issues with circleCI if/when they arise on main.