MFEh2o / db

**Contains the main issue tracker for the MFE DB!** Functions for interacting with the MFE database, in script format. (See also MFEUtilities, which is an R package that includes many/most of the same functions).
1 stars 0 forks source link

ISOTOPE tables fixes #133

Open kaijagahm opened 3 years ago

kaijagahm commented 3 years ago

a. Zoops should probably have a Z, not a W

b. NA batchIDs

~~c. Should B-B102 actually be B-0102?

d isotopeID’s P-0493 through P-0512 are in IR but not in ISPOC. Add to ISPOC? Last one in ISPOC is indeed P-0492, so seems like not all the data got entered.

e. ISOTOPE_SAMPLES_FISH - Are the fish supposed to match up to FISH_INFO? If so, can just remove some of the digits on the fish number. If not, need to note this. - Try removing 0’s; then match back to FISH_INFO. Should have a 1 in the removed or mortality columns.

kaijagahm commented 3 years ago

Another two issues: f. ISOTOPE_SAMPLES_BENTHIC_INVERTS has many "unknown"s in the metadataID column. There are also many rows (including many of these unknowns) where the metadataID column doesn't match the metadataID portion of the sampleID. It's not clear to me whether this is a SAMPLES or a DATA table--i.e. should I be running a metadataID check at all? Even if no (if it's a data table), then we should still address the "unknown" values. A lot of the other ISOTOPE_SAMPLES tables don't even have a metadataID column. Why does this one?

g. ISOTOPE_SAMPLES_METHANE has 42 rows where the depthBottom in the sampleID is really different from the depthBottom in the depthBottom column. This does not seem to be a case of swapping depthBottom and depthTop--something else is going on.

h. The formatting of the batchID column in ISOTOPE_SAMPLES_FISH (and ISOTOPE_SAMPLES_BENTHIC_INVERTS) is weird: multiple batches, comma-separated. Consider pivoting to long format? The way it is, we can't run a foreign key check with ISOTOPE_BATCHES. Make sure to check the other ISOTOPE_SAMPLES* tables for this formatting: I stopped after ISBI and ISF.

kaijagahm commented 3 years ago

The structure we have here is kind of weird--the ISOTOPE_SAMPLES tables link to the other SAMPLES tables (or in the case of ISOTOPE_SAMPLES_FISH, to the FISH_INFO table). Do we like this 2-layered approach, or should we cut out the middlemen somehow?

But, on the other hand, the columns required in the ISOTOPE_SAMPLES tables might be different than the ones in SAMPLES or FISH_SAMPLES, so I understand why it was done this way.

kaijagahm commented 3 years ago

Will want to update the dbschema to show these tables in a clearer way once these problems are sorted out.

ctsolomon commented 1 year ago

In ISOTOPE_RESULTS, for fish samples (isotopeID F-xxxx) and possibly others: rows giving H isotope results show an isotope ratio of 0.00 in the d13C and d15N columns. This should presumably be an NA.

ctsolomon commented 1 year ago

Re issue (a) in Kaija's original post: I am now part of the way towards resolving this. Mike McGuigan and Stuart found original isotope log files from 2011-2013. I have gone through these to pull out all the zoop isotope sample info. I have the original files and a script that manipulates them to assemble a replacement version of ISOTOPE_SAMPLES_ZOOPS. This is in my files at 'Database\Issues and fixes\GH 133 - ISOTOPE tables fixes'. It needs some additional work to finalize but it's a good start.