Closed peterdesmet closed 1 year ago
@niconoe It would be good if I could define the country, radar and year as variables. My attempt only works for the source path, not the destination path:
country="be"
radar="jab"
year="2020"
aws s3 ls lw-enram/$country/$radar/$year/
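For comparison, a minimal Python sketch of building both prefixes from the same variables. The destination layout (`<source>/hdf5/<country><radar>/<year>/`) is taken from the pseudo code in this thread; the `data_source` name and its value are assumptions for illustration only.

```python
# Same values as the shell variables above.
country = "be"
radar = "jab"
year = "2020"
# Assumption: folder name used at the destination for this data.
data_source = "baltrad"

# Source layout: <country>/<radar>/<year>/
source_prefix = f"{country}/{radar}/{year}/"

# Destination layout: <data_source>/hdf5/<country><radar>/<year>/
dest_prefix = f"{data_source}/hdf5/{country}{radar}/{year}/"

print(f"s3://lw-enram/{source_prefix}")
print(f"s3://aloft/{dest_prefix}")
```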
Pseudo code for copying files:
source_bucket = "s3://lw-enram"
dest_bucket = "s3://aloft"
for path in source_bucket:
    # Example source path: "s3://lw-enram/be/jab/2020/02/05/00/bejab_vp_20200205T004000Z_0x9.h5"
    # Parse path
    radar = dir1 & dir2 # bejab
    year = dir3 # 2020
    month = dir4 # 02
    day = dir5 # 05
    file = basename # bejab_vp_20200205T004000Z_0x9.h5
    file_ext = extension # h5
    # Set source
    if year == 2016:
        source = "ecog-04003"
    else:
        source = "baltrad"
    # Copy file
    if file_ext != "h5":
        skip
    elif file exists at destination:
        skip
    else:
        copy file to {dest_bucket}/{source}/hdf5/{radar}/{year}/{month}/{day}/{file}
        # Example dest path: "s3://aloft/baltrad/hdf5/bejab/2020/02/05/bejab_vp_20200205T004000Z_0x9.h5"
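The path-mapping part of the pseudo code could look roughly like this in Python. This is only a sketch: the function name `dest_key` is hypothetical, and it assumes every source key has exactly the seven-part layout `country/radar/year/month/day/hour/file` shown in the example path; keys with any other shape are skipped.

```python
import posixpath


def dest_key(source_key):
    """Map a source key to its destination key, or None if it should be skipped."""
    parts = source_key.split("/")
    # Assumed layout: country/radar/year/month/day/hour/file
    if len(parts) != 7:
        return None
    country, radar, year, month, day, _hour, filename = parts
    # Only HDF5 files are copied.
    if posixpath.splitext(filename)[1] != ".h5":
        return None
    # 2016 data goes to "ecog-04003", everything else to "baltrad".
    source = "ecog-04003" if year == "2016" else "baltrad"
    # The hour directory is dropped at the destination.
    return f"{source}/hdf5/{country}{radar}/{year}/{month}/{day}/{filename}"


print(dest_key("be/jab/2020/02/05/00/bejab_vp_20200205T004000Z_0x9.h5"))
```

Keeping the mapping in a pure function makes it easy to check against a few known paths before any data is copied.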
Update: implementation in progress (simple Python scripts that only require the boto3 package).
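As an illustration of what such a script might look like (not the actual implementation), here is a sketch of a copy loop that skips files already present at the destination. The function name `copy_missing` and its parameters are hypothetical. `s3` is expected to be a boto3 S3 client, and `map_key` is any function that returns the destination key for a source key, or `None` to skip it.

```python
def copy_missing(s3, source_bucket, dest_bucket, map_key, prefix=""):
    """Copy objects whose mapped key is absent from dest_bucket; return the copied keys."""
    copied = []
    paginator = s3.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=source_bucket, Prefix=prefix):
        for obj in page.get("Contents", []):
            src_key = obj["Key"]
            dst_key = map_key(src_key)
            if dst_key is None:
                continue  # the mapper decided to skip this object
            # Existence check: among keys sharing dst_key as a prefix,
            # the exact key sorts first, so one result is enough.
            found = s3.list_objects_v2(Bucket=dest_bucket, Prefix=dst_key, MaxKeys=1)
            if any(o["Key"] == dst_key for o in found.get("Contents", [])):
                continue  # already at the destination
            s3.copy_object(
                Bucket=dest_bucket,
                Key=dst_key,
                CopySource={"Bucket": source_bucket, "Key": src_key},
            )
            copied.append(dst_key)
    return copied
```

It could be called as `copy_missing(boto3.client("s3"), "lw-enram", "aloft", map_key, prefix="be/jab/2020/")`. Passing the client in as an argument means the loop can be tried against a stand-in object before touching real buckets.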
There is now a consensus on how the repo should be structured, see #65. @niconoe let me know when your code is ready, so we can start copying data.
See this post to flatten file structure
More elaborate example using variables (that currently returns an error):
flyway files: ecog-04003
baltrad files: baltrad