ucd-library / glm-lightning

Processing the AWS cloud lightning product
MIT License

Energy and area of flashes data #3

Open prakashvs613 opened 2 years ago

prakashvs613 commented 2 years ago

@qjhart

The area in the flashes.csv file is always either 0 or 3. The conversion to m² is discussed in https://github.com/ucd-library/glm-lightning/blob/main/docker/db/initdb.d/glm.org, but values of only 0 or 3 still do not make sense to me. The energy column looks wrong too.

prakashvs613 commented 2 years ago

@qjhart,

I was looking at some of the recent satellite data, and here is a lightning message (reformatted so that each flash appears on its own line).

message received with  {'topic': 'lightning', 'message': {'id': 'dfcaa02e-0515-4d47-bddc-53c3595302a6', 'time': '2022-09-20T22:13:04.872Z', 'source': 'generic-payload-parser/parse-lightning.js', 'datacontenttype': 'application/json', 

 'data': {'satellite': 'west', 'product': 'lightning-detection-flash-data', 'date': '2022-09-20', 'hour': '21', 'minsec': '42-20', 'band': '34955336', 'ms': '34955336', 'apid': '302', 'x': -1, 'y': -1, 'datetime': '2022-09-20T21:42:20.000Z', 'payload': [

 {'flash_id': 64117, 'flash_time_offset_of_first_event': -18509, 'flash_time_offset_of_last_event': -18390, 'flash_frame_time_offset_of_first_event': -18180, 'flash_frame_time_offset_of_last_event': -18060, 'flash_lat': 23.504886627197266, 'flash_lon': -105.69198608398438, 'flash_area': 1891, 'flash_energy': 20, 'flash_quality_flag': 0, 'flash_x': 16677, 'flash_y': 5986}, 

 {'flash_id': 64113, 'flash_time_offset_of_first_event': -19526, 'flash_time_offset_of_last_event': -18328, 'flash_frame_time_offset_of_first_event': -19191, 'flash_frame_time_offset_of_last_event': -17992, 'flash_lat': 17.992263793945312, 'flash_lon': -93.50341033935547, 'flash_area': 2217, 'flash_energy': 421, 'flash_quality_flag': 0, 'flash_x': 18721, 'flash_y': 7141}, 

 {'flash_id': 64118, 'flash_time_offset_of_first_event': -18401, 'flash_time_offset_of_last_event': -18381, 'flash_frame_time_offset_of_first_event': -18065, 'flash_frame_time_offset_of_last_event': -18044, 'flash_lat': 14.441597938537598, 'flash_lon': -91.2100601196289, 'flash_area': 1752, 'flash_energy': 30, 'flash_quality_flag': 0, 'flash_x': 19175, 'flash_y': 7861}, 

 {'flash_id': 64115, 'flash_time_offset_of_first_event': -18921, 'flash_time_offset_of_last_event': -17816, 'flash_frame_time_offset_of_first_event': -18586, 'flash_frame_time_offset_of_last_event': -17481, 'flash_lat': 33.991634368896484, 'flash_lon': -105.56458282470703, 'flash_area': 482, 'flash_energy': 53, 'flash_quality_flag': 0, 'flash_x': 16064, 'flash_y': 4127}, 

 {'flash_id': 64116, 'flash_time_offset_of_first_event': -18594, 'flash_time_offset_of_last_event': -17917, 'flash_frame_time_offset_of_first_event': -18248, 'flash_frame_time_offset_of_last_event': -17570, 'flash_lat': 28.592723846435547, 'flash_lon': -82.12464141845703, 'flash_area': 2303, 'flash_energy': 228, 'flash_quality_flag': 0, 'flash_x': 19209, 'flash_y': 5282}, 

 {'flash_id': 64120, 'flash_time_offset_of_first_event': -17913, 'flash_time_offset_of_last_event': -17575, 'flash_frame_time_offset_of_first_event': -17581, 'flash_frame_time_offset_of_last_event': -17242, 'flash_lat': 34.75755310058594, 'flash_lon': -109.89289855957031, 'flash_area': 942, 'flash_energy': 97, 'flash_quality_flag': 0, 'flash_x': 15383, 'flash_y': 3967}, 

 {'flash_id': 64123, 'flash_time_offset_of_first_event': -17566, 'flash_time_offset_of_last_event': -17410, 'flash_frame_time_offset_of_first_event': -17237, 'flash_frame_time_offset_of_last_event': -17080, 'flash_lat': 19.75145149230957, 'flash_lon': -103.19202423095703, 'flash_area': 946, 'flash_energy': 15, 'flash_quality_flag': 0, 'flash_x': 17253, 'flash_y': 6728}, 

 {'flash_id': 64124, 'flash_time_offset_of_first_event': -18295, 'flash_time_offset_of_last_event': -17362, 'flash_frame_time_offset_of_first_event': -17961, 'flash_frame_time_offset_of_last_event': -17028, 'flash_lat': 34.25373077392578, 'flash_lon': -106.01074981689453, 'flash_area': 4338, 'flash_energy': 116, 'flash_quality_flag': 0, 'flash_x': 15982, 'flash_y': 4080}, 

 {'flash_id': 64122, 'flash_time_offset_of_first_event': -17911, 'flash_time_offset_of_last_event': -17296, 'flash_frame_time_offset_of_first_event': -17570, 'flash_frame_time_offset_of_last_event': -16955, 'flash_lat': 15.337690353393555, 'flash_lon': -85.42262268066406, 'flash_area': 2012, 'flash_energy': 246, 'flash_quality_flag': 0, 'flash_x': 19790, 'flash_y': 7721}], 

 'files': ['/west/lightning-detection-flash-data/2022-09-20/21/42-20/34955336/302/payload.json']}}}

The values of area and energy (and also lat-lon) appear to be recorded properly in the message. Do you think the data the satellite returns and the data the spreadsheet stores differ somewhere in the copying step (I mean, beyond the unit change in the glm.org file)?
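To make that comparison concrete, here is a minimal sketch that pulls the raw area and energy out of one message so they can be set side by side with the rows in flashes.csv. The function name `raw_area_energy` is hypothetical, and the sample dict is trimmed to two flashes and three fields copied from the message above.

```python
# Two flashes copied from the message above, trimmed to the fields of interest.
message = {
    "topic": "lightning",
    "message": {
        "data": {
            "satellite": "west",
            "payload": [
                {"flash_id": 64117, "flash_area": 1891, "flash_energy": 20},
                {"flash_id": 64113, "flash_area": 2217, "flash_energy": 421},
            ],
        }
    },
}


def raw_area_energy(msg):
    """Return (flash_id, flash_area, flash_energy) tuples from one message.

    Hypothetical helper: it only reads the nested payload list and applies
    no unit conversion, so the values are the raw quantized integers.
    """
    payload = msg["message"]["data"]["payload"]
    return [(f["flash_id"], f["flash_area"], f["flash_energy"]) for f in payload]


rows = raw_area_energy(message)
for row in rows:
    print(row)
```

If the same flash_id shows 0 or 3 in the CSV but a four-digit raw area here, the difference would have to come from a step after the message is received.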

qjhart commented 2 years ago

Looking at the documentation for the GLM data product, https://www-staging.goesr.woc.noaa.gov/users/docs/PUG-GRB-vol4.pdf, you can see that both area and energy are quantized values. I'm pretty sure these were loaded properly, but there is an issue with how the unsigned values are handled, so I'm not certain.

For the energy, I get the histogram below for the CA data. So it at least looks reasonable.

with hist as (
 select width_bucket(energy,0,200,10) as tb,numrange(min(energy)::numeric,max(energy)::numeric) as energy,
count(*) as cnt 
from flash group by 1 order by 1
) 
select tb,energy,cnt,repeat('■',(cnt::float / max(cnt) over() * 30)::int) as bar from hist;

 tb |   energy    |  cnt   |              bar
----+-------------+--------+--------------------------------
  1 | [1,19)      | 144998 | ■■■■■■■■■■■■■■■■■■■■■■■■■
  2 | [20,39)     | 139332 | ■■■■■■■■■■■■■■■■■■■■■■■■
  3 | [40,59)     |  88051 | ■■■■■■■■■■■■■■■
  4 | [60,79)     |  62000 | ■■■■■■■■■■
  5 | [80,99)     |  46002 | ■■■■■■■■
  6 | [100,119)   |  35765 | ■■■■■■
  7 | [120,139)   |  28485 | ■■■■■
  8 | [140,159)   |  23210 | ■■■■
  9 | [160,179)   |  19303 | ■■■
 10 | [180,199)   |  16243 | ■■■
 11 | [200,65535) | 177361 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
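For readers without access to the database, the bucketing and bar logic of the query above can be sketched in Python. This is an approximation under stated assumptions: `width_bucket` here mimics the usual behaviour of the SQL function (bucket 0 below the range, bucket count+1 at or above the upper bound), `text_histogram` is a hypothetical helper, and the sample energies are the nine raw values from the message quoted earlier in this thread, not the CA data.

```python
from collections import Counter


def width_bucket(value, low, high, count):
    """Equal-width bucket number, in the style of SQL width_bucket."""
    if value < low:
        return 0            # underflow bucket
    if value >= high:
        return count + 1    # overflow bucket
    return int((value - low) * count // (high - low)) + 1


def text_histogram(values, low, high, count, width=30):
    """Return one text line per non-empty bucket, with a scaled bar."""
    counts = Counter(width_bucket(v, low, high, count) for v in values)
    if not counts:
        return []
    peak = max(counts.values())
    lines = []
    for bucket in sorted(counts):
        # Scale the bar so the fullest bucket gets `width` blocks.
        bar = "■" * round(counts[bucket] / peak * width)
        lines.append(f"{bucket:3d} | {counts[bucket]:6d} | {bar}")
    return lines


# Raw flash_energy values from the message quoted earlier in this thread.
energies = [20, 421, 30, 53, 228, 97, 15, 116, 246]
for line in text_histogram(energies, 0, 200, 10):
    print(line)
```

As in the query, everything at or above the upper bound lands in one overflow bucket (11 here), which is why that row can dominate the chart.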

One strange thing: in the glm.org file, I wrote a function for conversion to Joules (https://github.com/ucd-library/glm-lightning/blob/main/docker/db/initdb.d/glm.org#energy-and-area). However, the NC file itself says this:

        short flash_energy(number_of_flashes) ;
                flash_energy:_FillValue = -1s ;
                flash_energy:long_name = "GLM L2+ Lightning Detection: flash radiant energy" ;
                flash_energy:standard_name = "lightning_radiant_energy" ;
                flash_energy:_Unsigned = "true" ;
                flash_energy:valid_range = 0s, -6s ;
                flash_energy:scale_factor = 1.52597e-15f ;
                flash_energy:add_offset = 0.f ;
                flash_energy:units = "J" ;
                flash_energy:coordinates = "group_parent_flash_id flash_id lightning_wavelength flash_time_threshold flash_time_offset_of_first_event flash_time_offset_of_last_event flash_lat flash_lon" ;
                flash_energy:grid_mapping = "goes_lat_lon_projection" ;
                flash_energy:cell_measures = "area: flash_area" ;
                flash_energy:cell_methods = "lightning_wavelength: sum flash_time_offset_of_first_event: flash_time_offset_of_last_event: sum area: mean (centroid location of constituent events defined by variables group_parent_flash_id and event_parent_group_id weighted by their radiant energies) where cloud" ;

I'm now a little concerned they changed the values midway through the data, and I need to check that.
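As a reference point for that check, here is a minimal sketch of how the header above would decode a stored value, assuming the usual netCDF packing convention (reinterpret the 16-bit short as unsigned because of `_Unsigned = "true"`, then apply `scale_factor` and `add_offset`). The function names are hypothetical, and this is not the conversion function from glm.org.

```python
import math

# Attributes copied from the flash_energy header above.
SCALE_FACTOR = 1.52597e-15
ADD_OFFSET = 0.0
FILL_VALUE = -1  # _FillValue = -1s


def as_unsigned16(value):
    """Reinterpret a 16-bit integer as unsigned (e.g. -6 -> 65530)."""
    return value & 0xFFFF


def energy_joules(raw):
    """Decode one stored flash_energy value to Joules, or None if it is fill.

    Assumes the standard unpacking: unsigned_raw * scale_factor + add_offset.
    """
    unsigned = as_unsigned16(raw)
    if unsigned == as_unsigned16(FILL_VALUE):
        return None
    return unsigned * SCALE_FACTOR + ADD_OFFSET


print(as_unsigned16(-6))    # upper end of valid_range as an unsigned value
print(energy_joules(421))   # raw value 421 from the message earlier in this thread
print(energy_joules(-1))    # fill value
```

Under this reading, valid_range = 0s, -6s corresponds to 0..65530 and the fill value to 65535, and a raw energy of 421 is roughly 6.4e-13 J. Since the top bucket of the energy histogram reaches 65535, it may be worth checking whether fill values were loaded as data.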

For area, the functions match, but the distribution looks biased: in the histogram below, the counts concentrate in alternating buckets.

with hist as (
 select width_bucket(area,0,2500,10) as tb,numrange(min(area)::numeric,max(area)::numeric) as area,
  count(*) as cnt 
  from flash group by 1 order by 1
) 
select tb,area,cnt,repeat('■',(cnt::float / max(cnt) over() * 30)::int) as bar from hist;

 tb |     area     |  cnt   |              bar
----+--------------+--------+--------------------------------
  1 | [34,107)     |   2899 |
  2 | [437,499)    | 159139 | ■■■■■■■■■■■■■■■■■■■■■■■■■
  3 | [500,613)    |   2908 |
  4 | [872,999)    | 194537 | ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
  5 | [1000,1153)  |   1357 |
  6 | [1309,1499)  | 120169 | ■■■■■■■■■■■■■■■■■■■
  7 | [1500,1749)  |    802 |
  8 | [1750,1999)  | 108089 | ■■■■■■■■■■■■■■■■■
  9 | [2000,2249)  |   4000 | ■
 10 | [2250,2499)  |  46385 | ■■■■■■■
 11 | [2500,65535) | 140465 | ■■■■■■■■■■■■■■■■■■■■■■
(11 rows)

For fun, this article gave me hints on the bar chart.