pietern / goestools

Tools to work with signals and files from GOES satellites
https://pietern.github.io/goestools/
BSD 2-Clause "Simplified" License

GOES nws directory / filenames are incorrect #100

Closed by KiwiInNZ 2 years ago

KiwiInNZ commented 4 years ago

While capturing data spanning 06-NOV-2020 to 07-NOV-2020 (UTC), the filenames and directories for GOES 16 / 17 / Himawari all looked perfectly valid; the nws directories and filenames did not.

Two directories were created:

drwxr-xr-x 2 pi pi 4096 Nov  7 22:00 1970-01-01
drwxr-xr-x 2 pi pi 4096 Nov  7 14:00 2020-03-10

The first looks to be aligned with the start of Unix time, and the second is from over 6 months ago (before I started capturing from GOES 17). So there appear to be two issues here: one directory dated at the Unix epoch and one dated months in the past.

Linked to this is that the filenames in each directory appear to be invalid, potentially as a result of the directory name issues.

Example filenames from the 1970-01-01 directory, with the same pattern seen for the other directory:

-rw-r--r-- 1 pi pi 44100 Nov  7 17:00 19700101T000000Z_20201107040005-pacsfc24_latestBW.gif
-rw-r--r-- 1 pi pi 55748 Nov  7 17:00 19700101T000000Z_20201107040045-pac24_latestBW.gif

The first part looks to be linked to midnight UTC on the date used for the directory name. I'm not sure this should be part of the filename, since the actual date / time is the next part of the filename. Can this first part be removed, along with the "_" separator? In that case the Z time zone indicator should probably be moved to the date / time in the second field.
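As a rough illustration of the renaming suggested above, here is a minimal Python sketch. The filename layout (a parsed stamp, an underscore, 14 received digits, a hyphen, then the product name) is an assumption inferred from the two example filenames in this report, and `proposed_name` is a hypothetical helper, not part of goesproc.

```python
import re

# Hypothetical pattern: "<parsed stamp>_<14 received digits>-<product name>".
# The layout is inferred from the example filenames above, not from goesproc.
NAME_RE = re.compile(r"^\d{8}T\d{6}Z_(\d{8})(\d{6})-(.+)$")


def proposed_name(filename: str) -> str:
    """Drop the parsed prefix and move the 'Z' marker onto the received stamp."""
    match = NAME_RE.match(filename)
    if match is None:
        # Leave names that do not fit the assumed layout untouched.
        return filename
    date_part, time_part, rest = match.groups()
    return f"{date_part}T{time_part}Z-{rest}"


print(proposed_name("19700101T000000Z_20201107040005-pacsfc24_latestBW.gif"))
# 20201107T040005Z-pacsfc24_latestBW.gif
```

Names that do not match the assumed layout are returned unchanged, so the sketch is safe to run over a mixed directory.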

Boosted09Foci commented 3 years ago

I'm seeing the same issue. Has an update to the script been made?

pietern commented 2 years ago

The code that tries to extract a timestamp from the NWS filename is located here: https://github.com/pietern/goestools/blob/9ca85c81784bc3b1f7e5ff409c4133eb16a3116a/src/goesproc/handler_nws_image.cc#L94-L102

Per the comment I wrote, I assume that for the HRIT stream we end up executing parseIrregularTime: https://github.com/pietern/goestools/blob/9ca85c81784bc3b1f7e5ff409c4133eb16a3116a/src/goesproc/handler_nws_image.cc#L9-L71

The comment at the top here describes the file format that was used at the time I wrote this code. This format consists of the year, the month, and then the day of the year, instead of the day of the month.
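To make the day-of-year versus day-of-month distinction concrete, here is a small Python sketch. It is not the logic in `handler_nws_image.cc`; `from_day_of_year` is a hypothetical helper that only shows how far off a date lands when a day-of-month value is read as a day of the year.

```python
from datetime import date, timedelta


def from_day_of_year(year: int, day_of_year: int) -> date:
    """Convert a 1-based day of the year into a calendar date."""
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)


# 7 November 2020 is day 312 of the year (2020 is a leap year).
print(from_day_of_year(2020, 312))  # 2020-11-07

# A field that actually holds the day of the month (7) would be
# misread as 7 January if it were treated as a day of the year.
print(from_day_of_year(2020, 7))  # 2020-01-07
```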

@KiwiInNZ @Boosted09Foci @Mopar44084 @creinemann Could you share what the current state of things is for the NWS images that get written? The second part of the filename is what is received. The first part of the filename is what is parsed. If the second part now uses the day of month instead of day of year then the code needs to be updated to reflect this.

If you have a couple examples from February and March we can infer the pattern that is used today.

Thanks in advance.

KiwiInNZ commented 2 years ago

I'm "fixing" the filenames automatically using https://github.com/wxcapture/wxcapture/blob/master/goes-code/wxcapture/process/find_files.py (lines 804-831) (so you can see exactly what I'm doing to fix the files).

These are downloaded from GOES 17. It would be good to get some GOES 16 ones too.

What I've done is to turn off this code (commented out line 1016) so I can see exactly what the current files are without my automated fixes.

I'll post here again later, but I expect that nothing has changed since my original logging of this issue as my fix code is still working as intended.

KiwiInNZ commented 2 years ago

After a few hours, the following files have been created:

These were all created in the nws/2022-01-16 directory, with no sign of a 1970-01-01 directory being created (yet?).

This is using the following version:

goesproc -- 543ece3 (Tue, 5 May 2020 10:42:24 +0200)

Part of goestools (https://github.com/pietern/goestools)
Written by Pieter Noordhuis and contributors

Let me know if you need any additional info. I'll keep my correction code commented out a bit longer to see whether that 1970 directory gets created again.

Shout out with any more questions.

KiwiInNZ commented 2 years ago

files.txt: 811 example filenames from today, none of them "from" 1970, i.e. the start of Unix epoch time.

pietern commented 2 years ago

Thanks for the information, @KiwiInNZ.

I suspect that 1970-01-01 will show up again in April (see the branch in the logic for month >= 4).

I made a fix and verified it works for files that came down today. Since the pattern appears to be fixed length in all cases (the November 2020 examples and the more recent ones), I expect the fix to be conclusive.
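For readers who post-process these files themselves, a fixed-length parse might look roughly like the Python sketch below. This is not the code from the actual fix (goesproc is C++). It assumes the received name starts with 14 digits in YYYYMMDDhhmmss order, as in the November 2020 examples, and `parse_received_stamp` and `directory_for` are hypothetical names.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_received_stamp(name: str) -> Optional[datetime]:
    """Parse a leading 14-digit YYYYMMDDhhmmss stamp, or return None."""
    prefix = name[:14]
    if len(prefix) != 14 or not prefix.isdigit():
        return None
    try:
        parsed = datetime.strptime(prefix, "%Y%m%d%H%M%S")
    except ValueError:
        # Digits that do not form a valid date, e.g. month 13.
        return None
    return parsed.replace(tzinfo=timezone.utc)


def directory_for(name: str) -> Optional[str]:
    """Return the YYYY-MM-DD directory name, or None when parsing fails."""
    stamp = parse_received_stamp(name)
    return stamp.strftime("%Y-%m-%d") if stamp is not None else None


print(directory_for("20201107040005-pacsfc24_latestBW.gif"))  # 2020-11-07
print(directory_for("latestBW.gif"))  # None

# A zero Unix timestamp formats as the epoch date seen in the report.
print(datetime.fromtimestamp(0, tz=timezone.utc).strftime("%Y-%m-%d"))  # 1970-01-01
```

Returning `None` on failure, rather than a zero timestamp, keeps unparseable names from silently landing in a 1970-01-01 directory.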

If you update your receiver everything should work out of the box. No need to change config files.

pietern commented 2 years ago

Many thanks to @creinemann for sharing his receiver over Tailscale so I could verify the fix.