OSOceanAcoustics / echodataflow

Orchestrated sonar data processing workflow
https://echodataflow.readthedocs.io/en/latest/
MIT License

Frequency differencing pipeline and Logging PR #57

Closed: Sohambutala closed this 4 months ago

Sohambutala commented 5 months ago
  1. Stream logging changes to support logging via Kafka.
  2. Exception handling fixes: the previous handling raised errors of its own while logging the errors encountered in the files.
  3. Adds two new stages, apply_mask and frequency_diff.
  4. Adds a log entry for each failed file, including its error message.
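
For readers following along, items 2 and 4 can be pictured with a minimal sketch. It is not the echodataflow implementation: the logger name, `process_files`, and `demo_stage` are hypothetical names chosen for illustration, and only the standard `logging` module is assumed. The idea is that a failure in one file is caught, logged together with its error message, and recorded, so the remaining files still run.

```python
import logging
from typing import Callable, Dict, Iterable, List

# Hypothetical logger name, used only for this sketch.
logger = logging.getLogger("echodataflow_sketch")


def process_files(files: Iterable[str], stage: Callable[[str], str]) -> Dict[str, List]:
    """Run `stage` on each file, recording failures instead of aborting."""
    results: Dict[str, List] = {"succeeded": [], "failed": []}
    for name in files:
        try:
            results["succeeded"].append(stage(name))
        except Exception as exc:  # broad on purpose: one bad file must not stop the run
            # Log the failed file name together with its error message.
            logger.error("Failed file %s: %s", name, exc)
            results["failed"].append((name, str(exc)))
    return results


def demo_stage(name: str) -> str:
    """Hypothetical stage that rejects one specific file."""
    if name == "bad.raw":
        raise ValueError("corrupt datagram")
    return name.replace(".raw", ".zarr")


summary = process_files(["a.raw", "bad.raw", "b.raw"], demo_stage)
```

Here `summary["failed"]` holds one `(file name, error message)` pair for `bad.raw`, while the other two files are processed normally.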

@valentina-s @leewujung

valentina-s commented 5 months ago

Thanks, @Sohambutala! Can you rename this PR to "Adding frequency differencing pipeline and log fixes"? Frequency diff and apply mask are new functionalities, and they should be visible.

I tried to run this pipeline through the notebook, and I got the following error message:

[Image: Screenshot 2024-04-10 at 1 53 15 PM]

Do you think those are aligned? Is this the way you are testing it?

Sohambutala commented 5 months ago

Hi @valentina-s,

I selectively cherry-picked commits from the DEV branch to create the PRs, because a significant backlog of changes had accumulated locally. The stage error occurred because file_utils, where the function definition was modified, was included in the Transect Group Rename PR.

I tested the package before release, and it has been set up on the Jetstream instance. However, I am still refining some aspects, specifically allowing inputs to be accessed with or without the .raw extension. At the moment, it requires zip files and filenames to follow a particular format, which does not match the current structure of the transect groups in the bucket.
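
To illustrate the "with or without the .raw extension" point, here is a minimal sketch. It is an assumption about one possible approach, not the code in the package; `with_raw_extension` and the file names are hypothetical. Plain string handling is used instead of `pathlib` so that bucket URLs are not altered.

```python
def with_raw_extension(name: str) -> str:
    """Return `name` unchanged if it already ends in .raw, else append .raw."""
    # Case-insensitive check so that names ending in .RAW are also accepted as-is.
    if name.lower().endswith(".raw"):
        return name
    return f"{name}.raw"


# Hypothetical transect file names, some with and some without the extension.
names = ["x0001_t1.raw", "x0001_t2", "x0002_t1.RAW"]
normalized = [with_raw_extension(n) for n in names]
```

Normalizing names once at the input boundary would let later stages assume a single format.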

valentina-s commented 5 months ago

I see. OK, you can merge these updates, and after that I can test the full pipeline. Which notebook did you use for testing the version on PyPI? Thanks, @Sohambutala!