Closed MartinLuethi closed 3 years ago
Given that you're the reference for Foxx, any suggestions?
I suggest:
Old Name | New Name |
---|---|
foxx1 | |
foxx2 | |
gull | |
h2015_X | harrington_2015_X |
m2020_X | mcdowell_2020_X |
pow | prince_of_wales |
site_ii | hansen_1958 SEE NOTE1 |
tdN | ? SEE NOTE2 |
NOTE1: We should pick a standard year, and there are often 3: drill year, data year, and publication year. I think publication year should be primary, but for unpublished data (only the case for very old data, such as TD), maybe data year is best, falling back to drill year. This is just for the name; ideally the `meta.bsv` file in each folder contains all 3 fields.
NOTE2: Yes, `tdN` is not descriptively named. See the td1 README for the provenance of this data. It comes from a PDF passed around via email from @MartinLuethi to @mankoff. Better provenance would help come up with better names. Until then, I'm open to suggestions.
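To make the NOTE1 fallback order concrete, here is a minimal sketch, assuming each record carries the three years as optional integers. The function name `naming_year` and its parameters are hypothetical and not part of the repository.

```python
from typing import Optional


def naming_year(pub_year: Optional[int] = None,
                data_year: Optional[int] = None,
                drill_year: Optional[int] = None) -> Optional[int]:
    """Pick the year used in a name: publication, else data, else drill.

    Hypothetical helper; the precedence follows NOTE1 only.
    """
    for year in (pub_year, data_year, drill_year):
        if year is not None:
            return year
    # No year known at all: leave the decision to the caller.
    return None
```

For example, `naming_year(pub_year=2015, drill_year=2011)` returns `2015`, while `naming_year(data_year=1990, drill_year=1989)` returns `1990`.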
Vote here by clicking on the smiley face at the top right of this comment and selecting a graphic.
Actually, I was more thinking about giving a larger region name (e.g. Paakitsooq TD5, Paakitsooq FOXX1), such that it is immediately clear in which area we are. Or maybe by some coordinate code 70N_50W_FOXX1, which is, however, less readable.
Naming the holes after publications is, in my opinion, not very useful, and much data has not been properly published. As for the year, I think the only meaningful year is the main measurement year. Publications often lag by a decade. I am not sure whether the year should be part of the designation, but if so, it would look like
Paakitsooq_FOXX1_2012 Paakitsooq_TD5_1990
Also, I prefer the station/hole names in upper case, as is often done in the publications. It clearly stands out as a designation, and it is also more readable IMHO.
The reference, exact coordinates, etc. are then given in the meta file.
> NOTE1: We should pick a standard year, and there are often 3: Drill year, data year, and publication year. I think publication year should be primary, but for unpublished data (only the case for very old data, such as TD), then maybe data year is best, and then fall back to drill year. This is just for the name, ideally the `meta.bsv` file in each folder contains all 3 fields.

Clearly the data year or the drill year.
I'm still partial to `author_YYYY` (or `Author_YYYY`) because that is how I think of most products, and how I hear most colleagues discussing data products. Not the famous ones (e.g. BedMachine, MEaSUREs, OMG, ArcticDEM, etc.), but non-famous products are usually referred to by `Author Year`. This may be more true for non-borehole products where things aren't even named (e.g. Mankoff 2020 ice discharge is un-named), but I think it holds true for many non-famous named products too.
What about:
* If all fields: `Author_pubYYYY_BOREHOLENAME_Location`?
* If no publication: `BOREHOLENAME_Location`

How do we define `location` for NGRIP, DYE, etc.? Sometimes it seems like the location comes after the borehole name.
> I'm still partial to `author_YYYY` (or `Author_YYYY`) because that is how I think of most products, and how I hear most colleagues discussing data products. Not the famous ones (e.g. BedMachine, MEaSUREs, OMG, ArcticDEM, etc.)

For ice temperatures, reading Renland, Dye3, NEEM immediately tells me what I'm looking at/for. I usually have no clue who published what when, so discovering the data sets by interest would not be possible.
> * If all fields: `Author_pubYYYY_BOREHOLENAME_Location`?
> * If no publication: `BOREHOLENAME_Location`

This sounds like a good proposition, although I would always like to have the order `Location_Boreholename_Author_pub`.
Several fluid-filled boreholes have been remeasured over the years, and this order makes more sense to me.
> How do we define `location` for NGRIP, DYE, etc.?

These are clearly defined locations. But there might be more (shallow) holes at NGRIP, NEEM, Eismitte, FOXX, etc. For big names (DYE3, NGRIP, ...) it is clear what it is. For one-off project sites like FOXX, TD5, etc., there should be a qualification of the area we're in. This is not fully logical, but I think this is how we talk and think about these sites anyway.
For example, I have no idea about the authors or the hole names on Store Glacier, but if the data is called Store_XY_Author_2017 it is immediately obvious where that data set is. And the next hole from a campaign a few years later would be Store_NEWHOLE_Authoress_2022.
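The proposed order can be sketched as a small helper. This is only an illustration under stated assumptions: the function `folder_name` and its parameters are hypothetical, hole names are upper-cased as preferred earlier in the thread, and the author/year part is dropped when there is no publication.

```python
from typing import Optional


def folder_name(location: str, borehole: str,
                author: Optional[str] = None,
                pub_year: Optional[int] = None) -> str:
    """Build Location_BOREHOLE[_Author_pubYear] (hypothetical helper)."""
    # Location first, so related holes sort together; hole name in upper case.
    parts = [location, borehole.upper()]
    # Append author and publication year only when both are known.
    if author and pub_year is not None:
        parts += [author, str(pub_year)]
    return "_".join(parts)
```

With this sketch, `folder_name("Store", "xy", "Author", 2017)` gives `Store_XY_Author_2017`, and an unpublished hole such as `folder_name("Paakitsooq", "td5")` gives `Paakitsooq_TD5`.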
Does this make sense?
Yes - that makes sense, and I'll defer to your preferred order. I'll rename things and push the updates with my next round of work on this project, and tag this issue when I do.
Hey, I am very partial to "Site_MeasurementYear". I think we want to keep site names short enough that they display in GIS OK. I think Fisher/Zdanowicz have laid a good blueprint with "Agassiz77, Agassiz79A, Agassiz79B, Agassiz84". The author and pub year can go in metadata, as far as I'm concerned. Measurement year system would also resolve issues like "TD51" and "TD52" -- which are really just TD5 measured in different years.
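A rough sketch of the "Site_MeasurementYear" pattern, assuming a two-digit year and an optional hole letter as in the Agassiz series. The function `short_site_name` and the `sep` parameter are hypothetical; the separator is only an assumption for site names that already end in a digit.

```python
def short_site_name(site: str, measurement_year: int,
                    hole: str = "", sep: str = "") -> str:
    """Build a short name like Agassiz79A (hypothetical helper)."""
    # Two-digit year keeps the name short for GIS labels, but it is
    # ambiguous across centuries (e.g. 1977 vs 2077).
    return f"{site}{sep}{measurement_year % 100:02d}{hole.upper()}"
```

For example, `short_site_name("Agassiz", 1979, "a")` gives `Agassiz79A`, and `short_site_name("TD5", 1990, sep="_")` gives `TD5_90`, which avoids running the site number and the year together.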
Also, just thinking about this more, we need borehole naming consistency between the geothermal and ice temperature databases where possible. We can't use the "AgassizYEAR" series in one database and then "Clark1987_NUMBER" in another database. In the geothermal database we elected not to rename boreholes from their originally published name. That was IHFC convention.
Also, and hopefully last point, the database does have an "Approximate Location" field that is a "More descriptive general location of the site." Why don't you put your SITE_YEAR_AUTHOR_YEAR string in that field, rename the field to "Detailed Name" and make that the primary index field on GitHub? Then you can leave "Site Name" alone.
Good points by @WilliamColgan. I realize now also that @MartinLuethi was referring to the first thing seen when accessing the data - the folder names. It would be good if these are descriptive. However, we do provide a KML map for easy viewing of where each site is. Perhaps the discussion should be more explicit about what people want named what. We have
And at borehole (each folder) in the metadata we have
And then there are also fields for references and years.
I think @MartinLuethi is talking about "Approximate location name", and suggesting that should also be the folder name?
I've made `folder name` equal `site name` by convention. Perhaps I was too quick in agreeing to change this. So far the only requirement is that `site name` matches the Geothermal Database paper (in progress, no link provided here).
Other changes than renaming `folder` to some combination of `Approximate location name`, `Data source`, and `Drill year` could be:
`meta.bsv` in each folder is generated from that. I'll make that change soon...
DB is now here: https://docs.google.com/spreadsheets/d/1QNqnjO7Gocl29Y7X693rCRRZI4dTSq0i58YghWeHy2Q/edit?usp=sharing
`Approximate location name` column.
Site names have now been updated to the format suggested by @MartinLuethi. See DB linked above.
Should the site names be more descriptive? Some of them are, but foxx, td1 and many others are not.
The Meighen ice cap is in the list as Meighan (with an a), which is either a new spelling or an error.