earthchem / EarthChemAdmin2

Admin tools for Earthchem database person, organization, affiliation, citation
Apache License 2.0

Create 5 Data-To-Go data sets for 2019 Q2 #14

Open sparkji opened 5 years ago

sparkji commented 5 years ago

As per deliverables, create 5 Data-To-Go data sets from ECDB. Possible compilations (note, these are merely suggestions and should get KL input):

    • xenoliths
    • ophiolites
    • all basalts (query already completed in Q1 by Bai for RT request by user)
    • diamonds
    • spreading centers
    • volcanoes
    • cratons

sparkji commented 5 years ago

Baihao has finished all five datasets. @klehnert55 @annikakj, please review and confirm whether we should move forward to the next step.

annikakj commented 5 years ago

Are there files to review?

annikakj commented 5 years ago

To test: go to the Admin tool and select Download. Under Query, select a group; under Material, refine if desired; under Variable Type, refine if desired.

annikakj commented 5 years ago

I have tested the following downloads: Aleutians_All, Ophiolite_All, Melt Inclusions_All, EPR_All, EPR_Minerals

A couple of comments:

  1. Mineral species (classification) must be present in all downloads that include minerals.
  2. For MIs, the classification of the host mineral must be included.
  3. The MI download lists about 1100 samples, but the data entry that I did this fall for DECADE contained close to 4700 samples, all of which were loaded to PetDB too. Why is there such a mismatch in the number of samples?
  4. Do we need every download to always have a standardized output of variable headings so that users can combine these downloads with each other and/or with other PetDB downloads?
  5. The basalt download is very large and has not been tested yet.

klehnert55 commented 5 years ago

Thanks, Annika, those are really important observations.

  1. Absolutely yes.
  2. Absolutely yes.
  3. I am specifically concerned about the mismatch of MI data between the Data-To-Go download and the load that Annika did.
  4. Yes, we should have a standard output for datasets that contain majors, traces, and isotope ratios (radiogenic & stable). I would eliminate U-series and most of the noble gas data except He. Annika and I will work on a list of variables that should be included in a standard output. Can we please get a count of the number of data points available for each variable? We had talked about that at the last meeting. That will help us decide.
  5. I don't think that it makes sense to have one file with 'all basalts'. We should group by either tectonic setting or region, and we should carefully select rock types. For Mid-Ocean Ridge Basalts, we need to include tholeiites, dolerites, etc., rock types that are closely related to basalts. Once we have defined a standard output format, several smaller 'basalt' files can be easily combined.
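
A minimal sketch of the two ideas in points 4 and 5, in case it helps the discussion: mapping each download onto one shared column list so that files can be concatenated, and counting the non-empty data points per variable. The column names (`SAMPLE_ID`, `SIO2`, `MGO`, `SR87_SR86`), the helper functions, and the toy values are all hypothetical; the real standard variable list is still to be defined and the Admin tool's actual headings may differ.

```python
import csv
import io

# Hypothetical standard column order; the real list has not been decided yet.
STANDARD_COLUMNS = ["SAMPLE_ID", "SIO2", "MGO", "SR87_SR86"]


def standardize(csv_text, columns=STANDARD_COLUMNS):
    """Read one download and return its rows restricted to the standard columns.

    Columns missing from the download are filled with empty strings;
    columns outside the standard list are dropped.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [{col: (row.get(col) or "").strip() for col in columns} for row in reader]


def count_data_points(rows, columns=STANDARD_COLUMNS):
    """Count the non-empty values for each variable across all rows."""
    return {col: sum(1 for row in rows if row[col] != "") for col in columns}


# Two toy downloads with different headings and column order (made-up values).
epr = "SAMPLE_ID,SIO2,MGO\nA1,50.1,7.9\nA2,49.8,\n"
aleutians = "SAMPLE_ID,SR87_SR86,SIO2,NOTE\nB1,0.7031,52.3,x\n"

# Once every file uses the same columns, combining them is plain concatenation.
combined = standardize(epr) + standardize(aleutians)
counts = count_data_points(combined)
print(counts)
```

With a fixed column list, several smaller files (for example, basalts grouped by tectonic setting) can be appended in any order, and the per-variable counts show which variables are populated well enough to keep in the standard output.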



bhchen8 commented 5 years ago

I ran the query below against the old PetDB and found only 904 samples with inclusion type 'GL'.

    SELECT COUNT(DISTINCT sample_Id)
    FROM inclusion i
    JOIN batch b ON i.BATCH_NUM = b.BATCH_NUM
    JOIN sample s ON b.sample_num = s.SAMPLE_NUM
    WHERE INCLUSION_TYPE = 'GL';
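
One way to look into the mismatch might be to count distinct samples for every inclusion type rather than only 'GL'. The sketch below runs that variation of the query above against an in-memory SQLite database. The schema is reduced to the columns named in the query, `sample_Id` is assumed to live on the `sample` table, and the `INCLUSION_NUM` column, the 'XX' type code, and all rows are made up for illustration; none of this reflects the real PetDB schema or data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Toy schema and rows (hypothetical; only the joined columns are modelled).
conn.executescript("""
    CREATE TABLE sample (SAMPLE_NUM INTEGER PRIMARY KEY, SAMPLE_ID TEXT);
    CREATE TABLE batch (BATCH_NUM INTEGER PRIMARY KEY, SAMPLE_NUM INTEGER);
    CREATE TABLE inclusion (
        INCLUSION_NUM INTEGER PRIMARY KEY,
        BATCH_NUM INTEGER,
        INCLUSION_TYPE TEXT
    );
    INSERT INTO sample VALUES (1, 'S1'), (2, 'S2'), (3, 'S3');
    INSERT INTO batch VALUES (10, 1), (11, 2), (12, 3);
    INSERT INTO inclusion VALUES
        (100, 10, 'GL'), (101, 10, 'GL'), (102, 11, 'GL'), (103, 12, 'XX');
""")

# Same joins as the original query, grouped by inclusion type.
breakdown = conn.execute("""
    SELECT i.INCLUSION_TYPE, COUNT(DISTINCT s.SAMPLE_ID)
    FROM inclusion i
    JOIN batch b ON i.BATCH_NUM = b.BATCH_NUM
    JOIN sample s ON b.SAMPLE_NUM = s.SAMPLE_NUM
    GROUP BY i.INCLUSION_TYPE
    ORDER BY i.INCLUSION_TYPE
""").fetchall()

print(breakdown)  # [('GL', 2), ('XX', 1)] for the toy rows above
conn.close()
```

Two inclusions in the same sample count once because of `COUNT(DISTINCT ...)`, so the result is a sample count per type, comparable to the 904 figure.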