pacificclimate / pdp

The PCIC Data Portal - Server software to run the entire web application
GNU General Public License v3.0

warn ASCII downloaders data has been packed #315

Open corviday opened 4 months ago

corviday commented 4 months ago

I have been corresponding with a user, James Craig, who was understandably confused that data downloaded in ASCII format is "packed" according to the netCDF standard, which is not indicated anywhere in the downloaded metadata-free CSV files. He suggested having the Output Format dropdown say "ASCII (scaled)" or similar to tip off users that this is occurring.

Not all the netCDF data is packed, though, so we would have to do this on a per-portal or per-dataset basis.

rod-glover commented 4 months ago

Yikes, yes that would be very confusing. Do the ASCII files contain the packing constants at least?

corviday commented 4 months ago

The ASCII files are data-only. Metadata, including packing constants, are accessed separately. One of course hopes all one's users are downloading the metadata to go with the data they download, but we don't have a way to enforce it.

It also may not be immediately clear to users who aren't used to netCDF what the "scale_factor" and "add_offset" attributes in the metadata mean.
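For anyone landing here with a packed CSV in hand, here is a minimal sketch of what those two attributes do under the netCDF packing convention: the real value is the stored value times "scale_factor" plus "add_offset". The function names and the sample constants below are made up for illustration; the actual constants have to be read from the metadata for the dataset in question, and this sketch does not handle missing or fill values.

```python
import csv
import io


def unpack_value(packed, scale_factor=1.0, add_offset=0.0):
    """netCDF packing convention: unpacked = packed * scale_factor + add_offset."""
    return packed * scale_factor + add_offset


def unpack_csv(text, scale_factor, add_offset):
    """Unpack every numeric cell of a data-only CSV string.

    Cells that cannot be parsed as numbers are passed through unchanged.
    (Hypothetical helper, not part of the portal.)
    """
    rows = []
    for row in csv.reader(io.StringIO(text)):
        unpacked = []
        for cell in row:
            try:
                unpacked.append(unpack_value(float(cell), scale_factor, add_offset))
            except ValueError:
                unpacked.append(cell)
        rows.append(unpacked)
    return rows


# Illustrative constants only; use the values from the dataset's metadata.
rows = unpack_csv("0,2,-4\n", scale_factor=0.5, add_offset=10.0)
```

If the metadata lists neither attribute for a variable, the defaults (1.0 and 0.0) leave the values unchanged, which matches the unpacked case.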

rod-glover commented 4 months ago

Generally speaking, having data and metadata separate can be a good thing. But I wonder if that is true here (and even whether those packing constants quite qualify as metadata).

Q: Are the packing constants the same across all files, or do they vary per file? If the latter, then getting the right metadata, and knowing how to get it, is as important as getting it at all. That would fuel my suspicion that the packing constants are not truly metadata.

corviday commented 4 months ago

I think in theory, the packing constants should be determined on a per-file basis in order to maximize available precision for that specific dataset. In practice, I have observed that whatever process Stephen uses generates all files in the same batch and variable (i.e., all CMIP6 precipitation) with the same packing constants.
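To illustrate the "in theory" part: a common recipe for choosing per-file constants spreads the file's own min..max range across every available integer code. The sketch below assumes a signed 16-bit packed type and does not reserve a code for a fill value. It is not a description of the process actually used to generate our files, and the function names are hypothetical.

```python
def packing_constants(data_min, data_max, n_bits=16):
    """Choose scale_factor and add_offset so data_min..data_max spans the
    full range of a signed n_bits integer (assumed recipe, no fill value)."""
    if data_max <= data_min:
        raise ValueError("data_max must be greater than data_min")
    scale_factor = (data_max - data_min) / (2 ** n_bits - 1)
    add_offset = data_min + 2 ** (n_bits - 1) * scale_factor
    return scale_factor, add_offset


def pack(value, scale_factor, add_offset):
    """Inverse of unpacking: round to the nearest integer code."""
    return round((value - add_offset) / scale_factor)


# Illustrative range only, chosen so the arithmetic is exact.
scale_factor, add_offset = packing_constants(0.0, 65535.0)
lowest_code = pack(0.0, scale_factor, add_offset)
highest_code = pack(65535.0, scale_factor, add_offset)
```

With constants chosen this way, the smallest value in the file maps to the lowest integer code and the largest to the highest, so a file with a narrower range gets a finer step size. Sharing one pair of constants across a whole batch trades some of that precision for consistency.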