plankton_products layers need CSV metadata header

mhidas commented 5 years ago

The new layers in plankton_products have no metadata header in their CSV downloads. Apart from the basic time and location columns, every column is named after a species or taxonomic group, which should be self-explanatory. We therefore don't need a separate entry for each one (some layers have hundreds!), but the units still need to be specified.

ggalibert commented 5 years ago

Current output looks like this:

FID	id	Route	Latitude	Longitude	SampleDateUTC	Year	Month	Day	Time_24hr	Amphipod	Appendicularian_Fritillariidae	...
cpr_zoop_htg_map.fid--1c06a99516a76a74981-2f51	15824	BRFI	-24.6077	157.3176	2012-05-04T21:48:26Z	2012	5	4	21:48:26	0	0.6667	...
cpr_zoop_htg_map.fid--1c06a99516a76a74981-2f50	15825	BRFI	-24.5622	157.6794	2012-05-04T23:51:53Z	2012	5	4	23:51:53	0	0.6667	...

As I understand it, the unit is always the same: "number per cubic metre", or ind/m3. I'm not sure how the current parameter_mapping harvester could integrate this, ideally as a single line. Otherwise, we could put the unit next to each taxon label. Something like:

FID	id	Route	Latitude	Longitude	SampleDateUTC	Year	Month	Day	Time_24hr	Amphipod (ind/m3)	Appendicularian_Fritillariidae (ind/m3)	...
cpr_zoop_htg_map.fid--1c06a99516a76a74981-2f51	15824	BRFI	-24.6077	157.3176	2012-05-04T21:48:26Z	2012	5	4	21:48:26	0	0.6667	...
cpr_zoop_htg_map.fid--1c06a99516a76a74981-2f50	15825	BRFI	-24.5622	157.6794	2012-05-04T23:51:53Z	2012	5	4	23:51:53	0	0.6667	...
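
For illustration only, the relabelling above could be sketched as follows. This is a minimal sketch, not how the harvester works: the function name `add_unit_suffix` and the `BASE_COLUMNS` set are hypothetical, the base columns are simply those visible in the sample output, and the tab separator is an assumption based on how the sample is displayed.

```python
# Basic (non-taxon) columns, taken from the sample output above.
# Assumption: any column not listed here is a taxon abundance column.
BASE_COLUMNS = {
    "FID", "id", "Route", "Latitude", "Longitude",
    "SampleDateUTC", "Year", "Month", "Day", "Time_24hr",
}


def add_unit_suffix(header_line, unit="ind/m3", sep="\t"):
    """Append ' (unit)' to every column name that is not a base column."""
    columns = header_line.rstrip("\n").split(sep)
    labelled = [
        col if col in BASE_COLUMNS else f"{col} ({unit})"
        for col in columns
    ]
    return sep.join(labelled)


if __name__ == "__main__":
    header = "\t".join([
        "FID", "id", "Route", "Latitude", "Longitude", "SampleDateUTC",
        "Year", "Month", "Day", "Time_24hr",
        "Amphipod", "Appendicularian_Fritillariidae",
    ])
    print(add_unit_suffix(header))
```

The sketch only rewrites the header text of an already exported file; it does not touch the column names stored in the database.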

mhidas commented 5 years ago

That would be possible, though it would require changing the name of every column in the database (and in the harvester). I think it would be easier to add a single line of text at the top to state that all abundances are in "number per cubic metre".

As far as I understand, the parameters mapping harvester/schema ultimately produces a single-column view, which is essentially just lines of text to add to the beginning of the CSV, so we can make it whatever we want. We can still follow the same tabular structure as for all other metadata headers and simply replace the actual column name with a generic "<taxon group name>".
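
To make the idea concrete, here is a rough sketch of prepending such header lines to a CSV. Everything about the layout is an assumption: the "Column name, Description, Unit" structure, the `# ` comment prefix, the description wording, and the helper names `build_header_lines` and `prepend_header` are hypothetical, and the real header should copy whatever structure the other layers already use.

```python
# Hypothetical header rows: (column name, description, unit).
# A single generic entry stands in for every taxon column.
HEADER_ROWS = [
    (
        "<taxon group name>",
        "Abundance of the named species or taxonomic group",
        "number per cubic metre",
    ),
]


def build_header_lines(rows, comment_prefix="# "):
    """Turn (name, description, unit) rows into plain lines of header text."""
    lines = [comment_prefix + "Column name, Description, Unit"]
    for name, description, unit in rows:
        lines.append(f"{comment_prefix}{name}, {description}, {unit}")
    return lines


def prepend_header(csv_text, header_lines):
    """Place the header lines in front of the unchanged CSV body."""
    return "\n".join(header_lines) + "\n" + csv_text


if __name__ == "__main__":
    body = "id,Route,Amphipod\n15824,BRFI,0\n"
    print(prepend_header(body, build_header_lines(HEADER_ROWS)))
```

With this approach the column names in the data stay untouched, and one generic line covers all taxon columns.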

bpasquer commented 5 years ago

Yes, I confirm that's the way to go in this case 👍

aodn / content

plankton_products layers need CSV metadata header #419