Open TAlonglong opened 2 years ago
What's the incoming message triggering this? And the configuration. I have a feeling I've seen something like this before.
config
[DEFAULT]
regions = eurol
[ears-metop-ascat]
regions=ears_high_res
pattern = ascat_{start_time:%Y%m%d_%H%M%S}_{platform_name:s}_{orbit_number:s}_ear_o_coa_{ears_station:s}
format = EPS
type = binary
sensor = ascat
timeliness = 30
level = 0
variant = EARS
publish_topic = /CAT/EARS/METOP-ASCAT/BUFR
service=move_it_metop-ascat-ears-coa
topics=/metop/ascat/ears/coa/l2_bufr
duration=600
publish_message_after_each_reception=True
[ears-metop-ascat-global-057]
regions=ears_high_res
pattern = ascat_{start_time:%Y%m%d_%H%M%S}_{platform_name:s}_{orbit_number:s}_eps_o_057_ovw.l2_bufr
format = EPS
type = binary
sensor = ascat
timeliness = 120
level = 0
variant = EARS
publish_topic = /CAT/GLOBAL/METOP-ASCAT-057/BUFR
service=move_it_metop-ascat-global-057
topics=/metop/ascat/global/057/l2_bufr
duration=600
publish_message_after_each_reception=True
[global-metop-amv-l2d]
regions=amv_collection
pattern = W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,{platform_name:s}+AVHRR_C_EUMP_{start_time:%Y%m%d%H%M%S}_{orbit_number:5s}_eps_o_amv_l2d.bin
format=BUFR
type=binary
data_processing_level=2
sensor=avhrr
#timeliness in minutes
timeliness=120
duration=180
variant=GLOBAL
publish_topic=/GLOBAL/METOP/AMV/2d
service=move_it_global-metop-amv-l2d
topics=/CHECK/GLOBAL/SEGMENTS/METOP/AMV/2d
publish_message_after_each_reception=True
The last message printed before the error:
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Waiting 4583.930227 seconds until timeout
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] mda: {'sensor': 'avhrr', 'platform_name': 'Metop-B', 'start_time': datetime.datetime(2022, 3, 21, 20, 52, 3), 'orbit_number': '49329', 'uri': 'ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'uid': 'W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'origin': '157.249.16.182:9062', 'end_time': datetime.datetime(2022, 3, 21, 20, 55, 3)}
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Adding area ID amv_collection to metadata for Metop-B
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Added expected granule Metop-B (2022-03-21 20:52:03) to area amv_collection
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Collection finished for Metop-B area amv_collection
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Composing topic.
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Is last file added: False
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] sending pytroll://GLOBAL/METOP/AMV/2d collection ubuntu@satellite-wind-processing-b 2022-03-21T21:38:39.079863 v1.01 application/json {"sensor": "avhrr", "platform_name": "Metop-B", "start_time": "2022-03-21T20:13:03", "orbit_number": "49329", "origin": "157.249.16.188:9062", "end_time": "2022-03-21T20:55:03", "collection_area_id": "amv_collection", "collection": [{"start_time": "2022-03-21T20:13:03", "end_time": "2022-03-21T20:16:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201303_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201303_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:16:03", "end_time": "2022-03-21T20:19:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201603_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201603_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:19:03", "end_time": "2022-03-21T20:22:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201903_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321201903_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:22:03", "end_time": "2022-03-21T20:25:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202203_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202203_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:25:03", "end_time": "2022-03-21T20:28:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202503_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202503_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:28:03", "end_time": "2022-03-21T20:31:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202803_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321202803_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:31:03", "end_time": "2022-03-21T20:34:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203103_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203103_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:34:03", "end_time": "2022-03-21T20:37:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203403_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203403_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:37:03", "end_time": "2022-03-21T20:40:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203703_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321203703_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:40:03", "end_time": "2022-03-21T20:43:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204003_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204003_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:43:03", "end_time": "2022-03-21T20:46:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204303_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204303_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:46:03", "end_time": "2022-03-21T20:49:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204603_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204603_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:49:03", "end_time": "2022-03-21T20:52:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204903_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321204903_49329_eps_o_amv_l2d.bin"}, {"start_time": "2022-03-21T20:52:03", "end_time": "2022-03-21T20:55:03", "uri": "ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin", "uid": "W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin"}]}
It's a bit tricky to follow the log. I'm not sure if this is a mix of two gatherer configs. But it looks like it gets a file, add granule. This completes the gathering, but at the same time it says the last segment is not added. and then sends the message. And then the error comes.
If the log messages are hard to follow, try to follow the code :sweat_smile:
In short, I have, at least yet, no good ideas why pytroll_collectors.region_collector.RegionCollector.granules
is an empty list when it's being accessed on the publishing step.
I think the above incoming message isn't what triggered the crash. The completed collection were published correctly (the area and file naming match). I'd think the crashing one is the previous incoming message, from which the Waiting for ...
and Is last file added
log messages are from.
OK, so here is the log with the message just before that. So it looks like this is not included and then we might have a race condition still handling of one segment when the next one comes in and is added, triggers complete and sends the message. All this when the previous message is still under handeling. Then the message is cleaned and when it tries to do some handeling then it is already removed?
Just a guess
[DEBUG: 2022-03-21 21:38:36 : pytroll_collectors.triggers._base] mda: {'sensor': 'avhrr', 'platform_name': 'Metop-B', 'start_time': datetime.datetime(2022, 3, 21, 20, 55, 3), 'orbit_number': '49329', 'uri': 'ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205503_49329_eps_o_amv_l2d.bin', 'uid': 'W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205503_49329_eps_o_amv_l2d.bin', 'origin': '157.249.16.188:9062', 'end_time': datetime.datetime(2022, 3, 21, 20, 58, 3)}
[DEBUG: 2022-03-21 21:38:36 : pytroll_collectors.region_collector] Adding area ID amv_collection to metadata for Metop-B
[DEBUG: 2022-03-21 21:38:36 : pytroll_collectors.region_collector] Platform name Metop-B and sensor avhrr: Start and end times = 20220321 20:55:03 20220321 20:58:03
[DEBUG: 2022-03-21 21:38:36 : trollsched.satpass] kwargs: {'instrument': 'avhrr'}
[DEBUG: 2022-03-21 21:38:36 : trollsched.satpass] instrument: avhrr
[DEBUG: 2022-03-21 21:38:36 : pyorbital.tlefile] Reading TLE from /data/pytroll/tle/tle_db/tle-latest-pytroll.txt
[DEBUG: 2022-03-21 21:38:36 : trollsched.boundary] Instrument = avhrr
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Waiting 4583.930227 seconds until timeout
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] mda: {'sensor': 'avhrr', 'platform_name': 'Metop-B', 'start_time': datetime.datetime(2022, 3, 21, 20, 52, 3), 'orbit_number': '49329', 'uri': 'ssh://satellite-wind-processing-b/data/pytroll/global-metop-amv-l2d/W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'uid': 'W_XX-EUMETSAT-Darmstadt,SOUNDING+SATELLITE,METOPB+AVHRR_C_EUMP_20220321205203_49329_eps_o_amv_l2d.bin', 'origin': '157.249.16.182:9062', 'end_time': datetime.datetime(2022, 3, 21, 20, 55, 3)}
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Adding area ID amv_collection to metadata for Metop-B
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Added expected granule Metop-B (2022-03-21 20:52:03) to area amv_collection
[INFO: 2022-03-21 21:38:39 : pytroll_collectors.region_collector] Collection finished for Metop-B area amv_collection
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Composing topic.
[DEBUG: 2022-03-21 21:38:39 : pytroll_collectors.triggers._base] Is last file added: False
Could very well be. Were there any on-going collections for the other configured sections? The Waiting 4583.930227 seconds until timeout
seems a bit odd being terminated so far ahead of time, but I guess that is becouse timeliness
in the config has been set to 120 minutes, which is more than one full orbit. How large is the collection area?
Hm, looks very big. I don't remember why this is so huge. Centerd around the north pole as far as can understanmd from the config
amv_collection:
description: Norway - Area for amv collection
projection:
proj: stere
lat_0: 90
lon_0: 0
lat_ts: 60
ellps: WGS84
shape:
height: 3776
width: 3776
area_extent:
lower_left_xy: [-7552173.63, -7552173.63]
upper_right_xy: [7552173.63, 7552173.63]
That is indeed a huge area:
from pyproj import Proj
prj = Proj({"proj": "stere", "lat_0": 90, "lon_0": 0, "lat_ts": 60, "ellps": "WGS84"})
print(prj(-7552173.63, -7552173.63, inverse=True))
shows
(-45.0, 6.044472281577602)
So the area goes 6° north of Equator covering most of the northern hemisphere. The description is a bit optimistic for the Norwegian territory :sweat_smile:
Both the TLE environment variable being empty and #329 lead to the geographic gatherer stopping, in case of #329 due to a strange area.
Ah, no, never mind. That was trollflow2, not the gatherer, which silently stops in case of 329.
geographic gatherer stopped working, but did not exit with a value supervisor recognized to trigger a restart. This cause supervisord not to restart the gatherer. Supervisor reported:
The log from geographic gatherer
I guess the metadata does not have an element, but I don't understand why
https://github.com/pytroll/pytroll-collectors/blob/d85bcf98455f1effde259debea9f790e7fa25ba2/pytroll_collectors/triggers/_base.py#L89-L92