Open willows-1 opened 3 years ago
@willows-1 Sorry, I just noticed that my proposed solution (expand_dims) doesn't work either. The actual problem is that the start_time attribute is slightly different for each band, and you cannot assign multiple values to the same time coordinate in the netCDF file. As a workaround you could average the timestamps or just pick the first one, like so:
mytime = scn['B01'].attrs['start_time']
for band in mybands:
    scn[band] = scn[band].expand_dims(time=[mytime])
scn.save_datasets(...)
Of course you can also use your alternative solution.
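The averaging variant can be sketched with plain datetime arithmetic; the sample timestamps below are made up for illustration, and in a real script they would come from `scn[band].attrs['start_time']`:

```python
from datetime import datetime, timedelta

# Hypothetical per-band start times; in practice:
# start_times = [scn[band].attrs['start_time'] for band in mybands]
start_times = [
    datetime(2021, 4, 17, 2, 0, 20),
    datetime(2021, 4, 17, 2, 0, 21),
    datetime(2021, 4, 17, 2, 0, 22),
]

# Average: earliest time plus the mean offset from the earliest time.
earliest = min(start_times)
mean_offset = sum((t - earliest for t in start_times), timedelta()) / len(start_times)
mytime = earliest + mean_offset
print(mytime)  # 2021-04-17 02:00:21
```

The resulting `mytime` can then be used in the expand_dims loop above instead of `scn['B01'].attrs['start_time']`.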
Thanks @sfinkens for your advice! I will try your suggested code and see if it works. For my alternative solution, does the code make sense? And will the data be stored and remain the same if I concatenate?
I think so, but you should check that yourself ;)
Yep, I tried it out, and this is the output:
The start time is there, but no end time.
mytime = scn['B01'].attrs['start_time']
for band in mybands:
    scn[band] = scn[band].expand_dims(time=[mytime])
scn.save_datasets(...)
Try that solution and check out the time_bnds variable in the netCDF file. It contains start and end timestamps. Maybe that is what you are looking for.
I am confused about how to proceed with the code. Is it okay if I paste the code below so you can help double-check it? Below is the entire updated code, which includes @sfinkens' and @djhoese's code:
from glob import glob
from datetime import datetime
import numpy as np
import pandas as pd
import scipy as sp
import xarray as xr
import matplotlib.pyplot as plt
%matplotlib notebook
from satpy import Scene, MultiScene, find_files_and_readers
from pyresample import geometry
filenames = glob('C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm/16_bands/*20210417_0200*.DAT')
scn = Scene(reader='ahi_hsd', filenames=filenames)
all_names = ['B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B09', 'B10', 'B11', 'B12', 'B13', 'B14', 'B15', 'B16']
scn.load(all_names)
mytime = scn['B01'].attrs['start_time']
for band in all_names:
    scn[band] = scn[band].expand_dims(time=[mytime])
mytime2 = scn['B01'].attrs['end_time']
for band in all_names:
    scn[band] = scn[band].expand_dims(time2=[mytime2])
#crop image
cropped_scn = scn.crop(ll_bbox=(103., 1.,105., 3.))
new_scn = cropped_scn.resample(resampler='native')
new_scn.save_datasets(writer='cf', datasets=all_names, filename='All band data at 0200 version3.nc', exclude_attrs=['raw_metadata'], base_dir="C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm")
But @sfinkens, how come you included "scn.save_datasets(...)" in the for loop?
Expanding the dimensions twice will make the dataset 4-dimensional (time2, time, y, x). I guess this is not what you want. Try removing the block
mytime2 = scn['B01'].attrs['end_time']
for band in all_names:
    scn[band] = scn[band].expand_dims(time2=[mytime2])
and then check out your generated netCDF file:
In [1]: import xarray as xr
In [2]: ds = xr.open_dataset('myfile.nc')
In [3]: ds['time_bnds']
Out[3]:
<xarray.DataArray 'time_bnds' (time: 1, bnds_1d: 2)>
array([['2021-05-20T03:50:21.126857000', '2021-05-20T03:59:40.525149000']],
      dtype='datetime64[ns]')
Coordinates:
  * time     (time) datetime64[ns] 2021-05-20T03:50:21.126857
Dimensions without coordinates: bnds_1d
There you have one start and end time for the scene.
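The dimension bookkeeping can be illustrated with plain NumPy; xarray's expand_dims behaves analogously, just with named dimensions:

```python
import numpy as np

# A hypothetical 2-D band of shape (y, x).
data = np.zeros((550, 550))

once = np.expand_dims(data, axis=0)   # (1, 550, 550)    -> (time, y, x)
twice = np.expand_dims(once, axis=0)  # (1, 1, 550, 550) -> (time2, time, y, x)
print(once.ndim, twice.ndim)  # 3 4
```

Calling expand_dims a second time adds another leading dimension, which is why the second block above produces an unwanted 4-D dataset.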
But @sfinkens, how come you included "scn.save_datasets(...)" in the for loop?
It's not; only indented lines belong to the for loop.
@sfinkens I am getting this error:
@sfinkens @willows-1 I'm trying to make it work and here's my new exported dataset:
If this looks fine, I can make a PR.
@zxdawn this looks fine. Is it possible to send the code here? I would like to have a look at your code to see how you managed to get the output.
@willows-1 I will create a PR and let you know later. I suppose the time_bnds should be changed to <channel>_time_bnds, as they may have different times. What do you think @sfinkens ?
@zxdawn sure, thank you. By the way, what is a PR?
@willows-1 PR is "pull request". Once the PR is merged, you can use the updated satpy to accomplish your task. Here's the PR link.
@zxdawn thank you very much for assisting with this issue. I will look into this and see if it's working for me.
@zxdawn the code that you mentioned in the PR link considers only one band channel. If I want to include all channels from B01-B16, how can I edit the code below (which you pasted in the PR link) to include all bands?
import xarray as xr
from glob import glob
from satpy import Scene, MultiScene
from satpy.multiscene import timeseries
abi_dir = '../data/GOES-16/ABI_L1/'
abi_name = 'OR_ABI-L1b-RadC-M6C13_G16_s'
channel = 'C13'
reader = 'abi_l1b'
filenames = glob(abi_dir+abi_name+'2020153000*') # two example files
# check the start_time and end_time of each file
scn_1 = Scene([filenames[0]], reader='abi_l1b')
scn_2 = Scene([filenames[1]], reader='abi_l1b')
# get the mscn
mscn = MultiScene.from_files(filenames, reader='abi_l1b')
mscn.load(['C13'])
blended_scene = mscn.blend(blend_function=timeseries)
# save the mscn to nc file
blended_scene.save_datasets(filename='test.nc')
ds = xr.open_dataset('./test.nc')
print(ds, '\n')
print('-'*5, 'C13_start_time')
print(ds.C13_start_time, '\n')
print('-'*5, 'C13_end_time')
print(ds.C13_end_time, '\n')
print('-'*5, 'time_bnds')
print(ds['time_bnds'])
@willows-1 It should be like this, as mentioned in the Guide:
mscn = MultiScene.from_files(glob('/data/abi/day_1/*C0[12]*.nc'), reader='abi_l1b')
mscn.load(['C01', 'C02'])
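As a small convenience, the 16 AHI band names can be generated instead of typed out; this only builds the list passed to mscn.load, while the glob pattern and reader stay as in the snippets above:

```python
# Build the AHI band-name list 'B01'..'B16' programmatically.
all_names = ['B{:02d}'.format(i) for i in range(1, 17)]
print(all_names[0], all_names[-1], len(all_names))  # B01 B16 16
# mscn.load(all_names)  # then load them all at once
```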
@zxdawn this is the code that I edited to include all the band channels:
import xarray as xr
from glob import glob
from satpy import Scene, MultiScene
from satpy.multiscene import timeseries
filenames = glob('C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm/16_bands/*20210417_0200*.DAT')
len(filenames)
# check the start_time and end_time of each file
scn_1 = Scene([filenames[0]], reader='ahi_hsd')
scn_2 = Scene([filenames[1]], reader='ahi_hsd')
all_names = ['B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B09', 'B10', 'B11', 'B12', 'B13', 'B14', 'B15', 'B16']
# get the mscn
mscn = MultiScene.from_files(filenames, reader='ahi_hsd')
mscn.load(all_names)
blended_scene = mscn.blend(blend_function=timeseries)
# save the mscn to nc file
blended_scene.save_datasets(writer='cf', datasets=all_names, filename='test.nc', exclude_attrs=['raw_metadata'], base_dir="C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm")
But when I tried to save the dataset, I get this error:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-10-7b7b5aa9c8bf> in <module>
1 # save the mscn to nc file
----> 2 blended_scene.save_datasets(writer='cf', datasets= all_names, filename='test.nc', exclude_attrs=['raw_metadata'], base_dir = "C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm")
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\scene.py in save_datasets(self, writer, filename, datasets, compute, **kwargs)
1036 filename=filename,
1037 **kwargs)
-> 1038 return writer.save_datasets(dataarrays, compute=compute, **save_kwargs)
1039
1040 @staticmethod
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\writers\cf_writer.py in save_datasets(self, datasets, filename, groups, header_attrs, engine, epoch, flatten_attrs, exclude_attrs, include_lonlats, pretty, compression, include_orig_name, numeric_name_prefix, **to_netcdf_kwargs)
756 group_datasets, epoch=epoch, flatten_attrs=flatten_attrs, exclude_attrs=exclude_attrs,
757 include_lonlats=include_lonlats, pretty=pretty, compression=compression,
--> 758 include_orig_name=include_orig_name, numeric_name_prefix=numeric_name_prefix)
759 dataset = xr.Dataset(datas)
760 if 'time' in dataset:
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\writers\cf_writer.py in _collect_datasets(self, datasets, epoch, flatten_attrs, exclude_attrs, include_lonlats, pretty, compression, include_orig_name, numeric_name_prefix)
654
655 # Check and prepare coordinates
--> 656 assert_xy_unique(datas)
657 link_coords(datas)
658 datas = make_alt_coords_unique(datas, pretty=pretty)
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\writers\cf_writer.py in assert_xy_unique(datas)
232 unique_x.add(token_x)
233 if len(unique_x) > 1 or len(unique_y) > 1:
--> 234 raise ValueError('Datasets to be saved in one file (or one group) must have identical projection coordinates. '
235 'Please group them by area or save them in separate files.')
236
ValueError: Datasets to be saved in one file (or one group) must have identical projection coordinates. Please group them by area or save them in separate files.
@willows-1 It seems some files have different projection coordinates. @djhoese may know the problem.
alright
You create two Scene objects and then don't use them. Is this on purpose? My guess on your error is that the MultiScene is accidentally grouping your two sets of files into one scene when it should be two. What do you get when you do:
print(len(mscn.scenes))
How many input files do you have? If you are loading multiple channels then they will have different resolutions and therefore different projection coordinates. You would need to resample your MultiScene so that all bands are at the same resolution.
Edit: These are just guesses.
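Why mixed-resolution bands end up with different projection coordinates can be seen with plain NumPy: for upsampling, native resampling essentially replicates coarse pixels onto the finer grid so that all bands share one set of coordinates. This is a conceptual sketch, not satpy's implementation:

```python
import numpy as np

# A hypothetical 2 km band (2x2); the corresponding 1 km grid is 4x4.
coarse = np.array([[1, 2],
                   [3, 4]])

# Upsample by pixel replication so both grids have the same shape.
fine = coarse.repeat(2, axis=0).repeat(2, axis=1)
print(fine.shape)  # (4, 4)
print(fine[0])     # [1 1 2 2]
```

After such a step every band lives on the same grid, which is what the CF writer's identical-coordinates check requires.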
@djhoese I have a total of 160 input files, and 16 band channels, from B01-B16. I have edited the code according to your suggestions. This is the code:
import xarray as xr
from glob import glob
from satpy import Scene, MultiScene
from satpy.multiscene import timeseries
filenames = glob('C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm/16_bands/*20210417_0200*.DAT')
len(filenames)
all_names = ['B01', 'B02', 'B03', 'B04', 'B05', 'B06', 'B07', 'B08', 'B09', 'B10', 'B11', 'B12', 'B13', 'B14', 'B15', 'B16']
# get the mscn
mscn = MultiScene.from_files(filenames, reader='ahi_hsd')
mscn.load(all_names)
#crop image
cropped_scn = mscn.crop(ll_bbox=(103., 1.,105., 3.))
new_mscn = cropped_scn.resample(resampler='native')
blended_scene = mscn.blend(blend_function=timeseries)
# save the mscn to nc file
blended_scene.save_datasets(writer='cf', datasets= all_names, filename='test.nc', exclude_attrs=['raw_metadata'], base_dir = "C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm")
I have done the resampling but I am still getting an error when I save the multi scene:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-9-7b7b5aa9c8bf> in <module>
1 # save the mscn to nc file
----> 2 blended_scene.save_datasets(writer='cf', datasets= all_names, filename='test.nc', exclude_attrs=['raw_metadata'], base_dir = "C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm")
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\scene.py in save_datasets(self, writer, filename, datasets, compute, **kwargs)
1036 filename=filename,
1037 **kwargs)
-> 1038 return writer.save_datasets(dataarrays, compute=compute, **save_kwargs)
1039
1040 @staticmethod
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\writers\cf_writer.py in save_datasets(self, datasets, filename, groups, header_attrs, engine, epoch, flatten_attrs, exclude_attrs, include_lonlats, pretty, compression, include_orig_name, numeric_name_prefix, **to_netcdf_kwargs)
756 group_datasets, epoch=epoch, flatten_attrs=flatten_attrs, exclude_attrs=exclude_attrs,
757 include_lonlats=include_lonlats, pretty=pretty, compression=compression,
--> 758 include_orig_name=include_orig_name, numeric_name_prefix=numeric_name_prefix)
759 dataset = xr.Dataset(datas)
760 if 'time' in dataset:
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\writers\cf_writer.py in _collect_datasets(self, datasets, epoch, flatten_attrs, exclude_attrs, include_lonlats, pretty, compression, include_orig_name, numeric_name_prefix)
654
655 # Check and prepare coordinates
--> 656 assert_xy_unique(datas)
657 link_coords(datas)
658 datas = make_alt_coords_unique(datas, pretty=pretty)
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\writers\cf_writer.py in assert_xy_unique(datas)
232 unique_x.add(token_x)
233 if len(unique_x) > 1 or len(unique_y) > 1:
--> 234 raise ValueError('Datasets to be saved in one file (or one group) must have identical projection coordinates. '
235 'Please group them by area or save them in separate files.')
236
ValueError: Datasets to be saved in one file (or one group) must have identical projection coordinates. Please group them by area or save them in separate files.
Can you do that print(len(mscn.scenes))? Also, you have blended_scene = mscn.blend(blend_function=timeseries), but you want blended_scene = new_mscn.blend(blend_function=timeseries), which changes mscn.blend to new_mscn.blend.
This is the output of print(len(mscn.scenes)):
I am still getting the same error when I changed to blended_scene = new_mscn.blend(blend_function=timeseries):
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-7-7b7b5aa9c8bf> in <module>
1 # save the mscn to nc file
----> 2 blended_scene.save_datasets(writer='cf', datasets= all_names, filename='test.nc', exclude_attrs=['raw_metadata'], base_dir = "C:/Users/binis/OneDrive/Desktop/To_Binish/ftp_h8_hsd_2pm")
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\scene.py in save_datasets(self, writer, filename, datasets, compute, **kwargs)
1036 filename=filename,
1037 **kwargs)
-> 1038 return writer.save_datasets(dataarrays, compute=compute, **save_kwargs)
1039
1040 @staticmethod
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\satpy\writers\cf_writer.py in save_datasets(self, datasets, filename, groups, header_attrs, engine, epoch, flatten_attrs, exclude_attrs, include_lonlats, pretty, compression, include_orig_name, numeric_name_prefix, **to_netcdf_kwargs)
760 if 'time' in dataset:
761 dataset['time_bnds'] = make_time_bounds(start_times,
--> 762 end_times)
763 dataset['time'].attrs['bounds'] = "time_bnds"
764 dataset['time'].attrs['standard_name'] = "time"
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\xarray\core\dataset.py in __setitem__(self, key, value)
1523
1524 else:
-> 1525 self.update({key: value})
1526
1527 def __delitem__(self, key: Hashable) -> None:
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\xarray\core\dataset.py in update(self, other)
4095 Dataset.assign
4096 """
-> 4097 merge_result = dataset_update_method(self, other)
4098 return self._replace(inplace=True, **merge_result._asdict())
4099
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\xarray\core\merge.py in dataset_update_method(dataset, other)
971 priority_arg=1,
972 indexes=indexes, # type: ignore
--> 973 combine_attrs="override",
974 )
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\xarray\core\merge.py in merge_core(objects, compat, join, combine_attrs, priority_arg, explicit_coords, indexes, fill_value)
619 coerced = coerce_pandas_values(objects)
620 aligned = deep_align(
--> 621 coerced, join=join, copy=False, indexes=indexes, fill_value=fill_value
622 )
623 collected = collect_variables_and_indexes(aligned)
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\xarray\core\alignment.py in deep_align(objects, join, copy, indexes, exclude, raise_on_invalid, fill_value)
431 indexes=indexes,
432 exclude=exclude,
--> 433 fill_value=fill_value,
434 )
435
C:\ProgramData\anaconda3\envs\satpy\lib\site-packages\xarray\core\alignment.py in align(join, copy, indexes, exclude, fill_value, *objects)
338 if len(unlabeled_sizes | {labeled_size}) > 1:
339 raise ValueError(
--> 340 f"arguments without labels along dimension {dim!r} cannot be "
341 f"aligned because they have different dimension size(s) {unlabeled_sizes!r} "
342 f"than the size of the aligned dimension labels: {labeled_size!r}"
ValueError: arguments without labels along dimension 'time' cannot be aligned because they have different dimension size(s) {1} than the size of the aligned dimension labels: 16
How many time steps do these 160 files represent? A single time? I haven't really been following this discussion, sorry.
If they represent more than one time step, then from_files is not working properly. If you are only using one time step, then why is MultiScene being used?
@djhoese the time error may be related to #1686. I suppose @willows-1 is using the official satpy, and the time series length is 16. If @willows-1 uses the modified source code in #1686, it may work (although that PR isn't finished yet).
Actually, @sfinkens suggested setting a single time for the scene through this code:
mytime = scn['B01'].attrs['start_time']
for band in mybands:
    scn[band] = scn[band].expand_dims(time=[mytime])
scn.save_datasets(...)
But @zxdawn used the MultiScene method, so that's why I tried to use MultiScene. Actually, I am getting confused too. Initially I used @sfinkens' method and @djhoese's suggestion and was able to get this output:
So since my aim is to get the start_time and end_time and the longitude and latitude data, the output above achieves the aim, right? Or is there a better way?
@zxdawn so your code might work once the PR is finished?
Actually @sfinkens suggested setting a single time for the scene through the code: ... But @zxdawn used the MultiScene method. So that's why I tried to use MultiScene.
As I said, the MultiScene + timeseries approach doesn't work at the moment. I'll look into #1686 soon.
noted, thank you
Hi, I would like to know how I can extract scene metadata (such as start and end time, longitude, latitude, etc.) from a satellite image. Appreciate all the help I can get!