Closed lbesnard closed 2 years ago
So those files were indexed in the anmn_wave schema. However, they're not harvested because their content didn't match what the harvester expected.
Removing the entries from the harvester means that the chef-private databags have to be modified, since these files currently don't match the regex (which was changed to be more specific). As it stands, the po_s3_del command wouldn't work.
Alternatively, the harvester could be changed to remove these entries.
However, the only reference to those files is in the indexed_file table. As proof, there is no data associated with these files in either the WMS or the WFS table:
select * from anmn_wave.anmn_wave_map where file_id in (select id from anmn_wave.indexed_file where url like 'IMOS/ANMN/NRS/REAL_TIME/%' and not deleted);
SELECT 0
select * from anmn_wave.anmn_wave_data where file_id in (select id from anmn_wave.indexed_file where url like 'IMOS/ANMN/NRS/REAL_TIME/%' and not deleted);
SELECT 0
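The two checks above could also be folded into a single anti-join that lists the index entries with no map or data rows. Below is a minimal sketch, not the harvester's own tooling: it runs the query against a throwaway in-memory SQLite database standing in for the real PostgreSQL schema. The table and column names come from the queries above; the sample rows, the `.nc` file names and the `OTHER/PREFIX` path are hypothetical.

```python
import sqlite3

# Throwaway in-memory stand-in for the anmn_wave schema (illustration only:
# the real tables live in PostgreSQL and have more columns than shown here).
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS anmn_wave")
conn.executescript("""
    CREATE TABLE anmn_wave.indexed_file (id INTEGER PRIMARY KEY, url TEXT, deleted BOOLEAN);
    CREATE TABLE anmn_wave.anmn_wave_map (file_id INTEGER);
    CREATE TABLE anmn_wave.anmn_wave_data (file_id INTEGER);

    -- Hypothetical rows: file 1 is indexed only, file 2 is fully harvested,
    -- file 3 is indexed but already flagged as deleted.
    INSERT INTO anmn_wave.indexed_file VALUES
        (1, 'IMOS/ANMN/NRS/REAL_TIME/example_a.nc', 0),
        (2, 'OTHER/PREFIX/example_b.nc', 0),
        (3, 'IMOS/ANMN/NRS/REAL_TIME/example_c.nc', 1);
    INSERT INTO anmn_wave.anmn_wave_map VALUES (2);
    INSERT INTO anmn_wave.anmn_wave_data VALUES (2);
""")

# Index entries under the prefix that are not deleted and that have no row
# in either the map table or the data table.
ORPHAN_QUERY = """
    SELECT f.id, f.url
    FROM anmn_wave.indexed_file f
    WHERE f.url LIKE 'IMOS/ANMN/NRS/REAL_TIME/%'
      AND NOT f.deleted
      AND NOT EXISTS (SELECT 1 FROM anmn_wave.anmn_wave_map m WHERE m.file_id = f.id)
      AND NOT EXISTS (SELECT 1 FROM anmn_wave.anmn_wave_data d WHERE d.file_id = f.id)
    ORDER BY f.id
"""

orphans = conn.execute(ORPHAN_QUERY).fetchall()
print(orphans)
```

The SQL in `ORPHAN_QUERY` uses only standard constructs, so it should also run unchanged in psql. There, a row count equal to the number of matching index entries would confirm that none of them carry map or data rows.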
I'd suggest not doing anything and just closing this issue.
FYI @ggalibert
By the look of it, the ANMN_WAVE harvester, which writes to the anmn_wave schema, has references to the ANMN NRS wave data files available at http://imos-data.s3-website-ap-southeast-2.amazonaws.com/?prefix=IMOS/ANMN/NRS/REAL_TIME/NRSDAR.
These files are part of the anmn_nrs_dar_yon schema and the harvester of the same name, so I don't understand why they are part of the anmn_wave schema as well.
Anyway, the anmn_wave schema seems to have references to files which don't exist such as:
It seems like this harvester doesn't clean up its data properly.