WorldWideTelescope / toasty

Break large images into "tile pyramids", with a focus on the all-sky TOAST format.
https://toasty.readthedocs.io/
MIT License

toasty pipeline candidates are not marked finished when published #49

Closed astrodavid10 closed 3 years ago

astrodavid10 commented 3 years ago

It would be great if toasty pipeline refresh removed candidates that are already published, to make it easier to download newly WCS-marked images (or just new images).

astrodavid10 commented 3 years ago

@pkgw per our conversation this morning - toasty pipeline refresh also adds the candidates that have already been published

pkgw commented 3 years ago

OK, if I pull down the NOIRLab corpus with toasty pipeline refresh, I get the following summary report:

analyzed 684 candidates from the image source
  - 337 processing candidates saved
  - 0 rejected as definitely unusable
  - 347 were already done
  - 0 were already marked to be ignored

Looking at the code, I've got an obvious oversight: I have code that checks for special ignore-this-image markers, but there's no way for you to actually create them! In my development I set up the markers using manual behind-the-scenes work but I never finished the feature. So there's no way for you to tell toasty "please ignore this image from now on".

So, that's the deal for images that should be permanently rejected. Are there any cases where solvable images with published data are getting re-inserted into the processing queue?
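
As a rough illustration of the four buckets in the summary report above, here is a minimal sketch in Python. It is not toasty's actual code: the function name `classify_candidates`, its parameters, and the order of the checks are all assumptions made for this example.

```python
def classify_candidates(candidate_ids, published_ids, ignored_ids,
                        is_usable=lambda image_id: True):
    """Sort image IDs into the four buckets of the summary report.

    Hypothetical helper: the names and the check order are assumptions,
    not toasty's real implementation.
    """
    counts = {"saved": 0, "rejected": 0, "done": 0, "ignored": 0}
    saved = []

    for image_id in candidate_ids:
        if image_id in published_ids:
            counts["done"] += 1        # "already done"
        elif image_id in ignored_ids:
            counts["ignored"] += 1     # "already marked to be ignored"
        elif not is_usable(image_id):
            counts["rejected"] += 1    # "rejected as definitely unusable"
        else:
            counts["saved"] += 1       # "processing candidates saved"
            saved.append(image_id)

    return saved, counts


# Made-up IDs that reproduce the proportions of the NOIRLab report:
# 684 images analyzed, 347 of them already published.
ids = [f"img{i:03d}" for i in range(684)]
saved, counts = classify_candidates(ids, published_ids=set(ids[:347]),
                                    ignored_ids=set())
print(counts)
```

With no way to create ignore markers, the `ignored_ids` set in this sketch would always be empty, which matches the "0 were already marked to be ignored" line in the report.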

pkgw commented 3 years ago

To address my perma-ignore oversight ... I am envisioning adding an ignore-rejects command that will flag all image IDs found in the rejects folder to be ignored permanently. How does that sound?

I do want the mark-for-ignore to be a process that requires a bit of intervention ... say that a new image shows up without WCS; maybe it could have WCS (such as iotw2035a), or maybe it's a picture of a telescope or something that is never going to have it. With the above approach, you can bulk-reject most things, but if there's something that might be solvable later, you can prevent it from being ignored by deleting its file out of the rejects directory.
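
The proposed workflow can be sketched as follows. This is only an illustration under stated assumptions: the function `ignore_rejects`, the one-file-per-image layout of the rejects directory, and the `.skip` marker files are hypothetical and do not describe how toasty stores its state.

```python
from pathlib import Path


def ignore_rejects(rejects_dir, ignore_dir):
    """Flag every image ID found in rejects_dir as permanently ignored.

    Assumptions (hypothetical, for illustration only): each rejected
    image is one file named "<image-id>.<ext>", and an ignore marker is
    an empty file named "<image-id>.skip" in ignore_dir.
    """
    rejects_dir = Path(rejects_dir)
    ignore_dir = Path(ignore_dir)
    ignore_dir.mkdir(parents=True, exist_ok=True)

    flagged = []
    for entry in sorted(rejects_dir.iterdir()):
        if not entry.is_file():
            continue
        image_id = entry.stem  # file name without its extension
        (ignore_dir / f"{image_id}.skip").touch()
        flagged.append(image_id)
    return flagged
```

In this sketch the manual intervention happens before the command runs: deleting a file such as the one for iotw2035a from the rejects directory means no marker is written for it, so it can come back as a candidate later.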

pkgw commented 3 years ago

I did a test run of marking rubin-auxtelspec{1,2,3} for perma-ignore and the implementation seems to be working.

astrodavid10 commented 3 years ago

I suppose this is really a two part desired enhancement.

  1. "toasty pipeline refresh" recognizes the candidates published, but still repopulates the candidates folder with each of those published candidates, so in the NOIRLab case, we go back to 684 total each time we refresh. So preferably it would only populate with the candidates not marked as published or ignored.
  2. I think a manual "ignore-rejects" command makes plenty of sense and I agree that intervention on the user end is desired.

astrodavid10 commented 3 years ago

Accidentally closed, my apologies!

pkgw commented 3 years ago

@astrodavid10 I'm not seeing the first behavior that you describe. If I do a "refresh" with a clean work directory, I only get 337 entries, for the candidates that haven't already been published.

If you delete the candidates directory altogether and do a refresh, do you still end up with entries for all 684 images? If so, what does the summary report say? The way the system works, if it thinks that an image is already published it really ought to avoid re-adding it as a candidate ... and I have trouble seeing how it might behave as expected for me but not for you. But you never know.

If somehow old candidate files are sticking around (which also shouldn't happen, but I can see more ways for that to go wrong), I could modify the refresh command to make sure to clear out the candidates directory before it runs.
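
A minimal sketch of what clearing out the candidates directory before a refresh could look like. The helper name `reset_candidates_dir` and the `candidates` subdirectory of a work directory are assumptions for this example, not toasty's actual layout or code.

```python
import shutil
from pathlib import Path


def reset_candidates_dir(work_dir, name="candidates"):
    """Delete and recreate the candidates directory so no stale files survive.

    Hypothetical helper: the directory name and location are assumptions.
    """
    candidates = Path(work_dir) / name
    if candidates.exists():
        shutil.rmtree(candidates)  # drop leftovers from earlier refreshes
    candidates.mkdir(parents=True)
    return candidates
```

Doing this by hand (deleting the candidates directory and then refreshing) is the same check suggested above for telling stale files apart from a real bug in the refresh logic.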

astrodavid10 commented 3 years ago

@pkgw I did a refresh with an empty candidates folder and was successful in only populating 337 candidates!

pkgw commented 3 years ago

Cool. In the near future (hopefully today) I'll make a new release of toasty that adds the ignore-rejects command. The release will also feature a major rework of image coordinate handling that will enable support for tiled FITS files. Those changes shouldn't break anything in your workflow, but please let me know if you run into any problems or see any funny behavior.

astrodavid10 commented 3 years ago

Sounds great to me! Here's an additional 20 images added to the NOIRLab Collection. More soon to come. https://worldwidetelescope.org/webclient/?wtml=https://data1.wwtassets.org/feeds/noirlab/noirlab_draft10.wtml

Have a great weekend,

David


pkgw commented 3 years ago

@astrodavid10 OK, I just finished the release process for toasty version 0.7.1, which should add the new ignore-rejects command and make the WCS changes. Let me know how it works! I'm going to close this issue since I think we've addressed the topics initially raised. Please go ahead and open a new one if there are more workflow improvements that you can think of (or bugs in what I've done here).