aodn / data-services

Scripts which are used to process incoming data in the data ingestion pipeline
GNU General Public License v3.0

po_s3_del does not handle * or ? in filename #511

Open ggalibert opened 8 years ago

ggalibert commented 8 years ago

Using * or ? in the filename passed to po_s3_del deceptively appears to work and is a source of errors.

ggalibert@10-aws-syd:~/$ po_s3_del IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T1?3000Z_CBG_FV00_1-hour-avg.nc
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T1?3000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T103000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T113000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T123000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T133000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T143000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T153000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T163000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T173000Z_CBG_FV00_1-hour-avg.nc'
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/15/IMOS_ACORN_V_20160515T183000Z_CBG_FV00_1-hour-avg.nc'

In the above example the files have been deleted from S3 but not from the database...

Either this should be fixed, or it should be clearly documented somewhere that wildcards cannot be used.
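
One possible shape for such a fix is a guard that refuses any path containing * or ? before anything is deleted. This is only a sketch: has_wildcard and safe_del are hypothetical names, not part of data-services, and the echo stands in for the real po_s3_del call.

```shell
# Hypothetical helper: succeeds (returns 0) if the argument contains * or ?.
has_wildcard() {
    case "$1" in
        *[*?]*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: refuses wildcard paths, otherwise proceeds.
safe_del() {
    if has_wildcard "$1"; then
        echo "refusing wildcard path: $1" >&2
        return 1
    fi
    # A real wrapper would call po_s3_del "$1" here; this sketch only prints.
    echo "would delete: $1"
}
```

With this in place, safe_del 'IMOS_ACORN_V_20160515T1?3000Z.nc' would fail up front instead of deleting from S3 only.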

lbesnard commented 8 years ago

I'm surprised it even half worked. I always thought that you had to give the full object path and that no wildcards were allowed. That's why you couldn't do things recursively and had to list the objects first, then delete them in a for loop: for f in object_path_list; do po_s3_del $f; done

Anyway, if you want to delete them from the database, you can run the command again with the full object path, without wildcards. That should work.

ghost commented 8 years ago

Looking at the code, it looks to me like it is intended to be per file with no wildcards. That said, it looks like it might half work if supplied with a wildcard:

https://github.com/aodn/data-services/blob/master/profile.d/util.sh#L26

It should also be logging when the unindex operation fails, so this may help identify specifically which files have been deleted from S3 but not from the database. I think the loop method from @lbesnard is probably the safest way to delete en masse, but there are references to "bulk" functions there as well, so there may well be another way to trigger it...

ggalibert commented 8 years ago

Thanks, will do the loop method. However, I tried these two other methods and unfortunately they didn't work (would have been handy...):

ggalibert@10-aws-syd:/mnt/imos-data$ find IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/ -type f -wholename "*IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/*.nc" -print0 | xargs -0 po_s3_del
xargs: po_s3_del: No such file or directory
ggalibert@10-aws-syd:/mnt/imos-data$ find IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/ -type f -wholename "*IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/*.nc" -exec po_s3_del {} +
find: `po_s3_del': No such file or directory

while

ggalibert@10-aws-syd:/mnt/imos-data$ find IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/ -type f -wholename "*IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/*.nc" -print0 | xargs -0 -I {} echo {}
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T003000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T033000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T043000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T063000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T073000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T083000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T093000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T103000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T113000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T123000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T133000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T143000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T153000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T163000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T173000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T183000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T193000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T203000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T223000Z_CBG_FV00_1-hour-avg.nc
IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/16/IMOS_ACORN_V_20160516T233000Z_CBG_FV00_1-hour-avg.nc

works

ggalibert commented 8 years ago

yep, worked with:

ggalibert@10-aws-syd:/mnt/imos-data$ for file in `find IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/ -type f -wholename "*IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/*.nc"`; do po_s3_del $file; done
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T023000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T023000Z_CBG_FV00_1-hour-avg.nc'
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T033000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T033000Z_CBG_FV00_1-hour-avg.nc'
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T043000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T043000Z_CBG_FV00_1-hour-avg.nc'
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T053000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T053000Z_CBG_FV00_1-hour-avg.nc'
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T063000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T063000Z_CBG_FV00_1-hour-avg.nc'
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T103000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T103000Z_CBG_FV00_1-hour-avg.nc'
Deleting 'IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T113000Z_CBG_FV00_1-hour-avg.nc' with index deletion
delete: 's3://imos-data/IMOS/ACORN/gridded_1h-avg-current-map_non-QC/CBG/2016/05/19/IMOS_ACORN_V_20160519T113000Z_CBG_FV00_1-hour-avg.nc'
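
A variant of the same loop that reads find's output line by line copes with spaces in file names, which the backtick form would split on. This is a minimal sketch that runs against a throwaway directory; fake_del is a hypothetical stand-in for po_s3_del and only prints what it would delete.

```shell
# Hypothetical stand-in for po_s3_del, so the sketch can run anywhere.
fake_del() {
    echo "would delete: $1"
}

# Build a small throwaway tree to search.
tmp=$(mktemp -d)
mkdir -p "$tmp/2016/05/19"
touch "$tmp/2016/05/19/a.nc" "$tmp/2016/05/19/b with space.nc" "$tmp/2016/05/19/notes.txt"

# Read one path per line; the loop body runs in a shell that knows the function.
result=$(
    cd "$tmp" &&
    find 2016/05/19 -type f -name '*.nc' | sort |
    while IFS= read -r file; do
        fake_del "$file"
    done
)
echo "$result"
rm -rf "$tmp"
```

The loop still makes one call per file, so each path is passed as a full object path with no wildcard.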

ghost commented 8 years ago

Might be because po_s3_del is defined as a shell function: xargs and find -exec run external programs, so they couldn't resolve it, whereas the bash loop runs inside the shell environment where the function is defined.
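
If po_s3_del is indeed a bash function, one way to make xargs usable is to export the function and have xargs start a child bash that calls it. A minimal sketch, assuming bash; fake_del is a hypothetical stand-in for po_s3_del and only prints its argument.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for po_s3_del: a shell function, not an executable on PATH.
fake_del() {
    echo "would delete: $1"
}

# xargs executes a new program, so it cannot see functions defined in this shell.
# Exporting the function makes it available to child bash processes.
export -f fake_del

# Each NUL-separated path is handed to a child bash as $1.
out=$(printf '%s\0' 'a/one.nc' 'a/two.nc' |
    xargs -0 -n 1 bash -c 'fake_del "$1"' _)
echo "$out"
```

The trailing _ fills $0 of the child shell so that the path arrives as $1. The plain loop remains simpler; this is only worth it when xargs features such as batching are needed.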