nested tables - Githubissues

ekrimmel commented 5 years ago

Email to Bill, 2019-03-05:

We are finding the nested table structure within EMu difficult to work with, and wonder if you have any ideas on what we could do to work smarter, or else give us peace of mind that it is what it is and there's nothing we could be doing better.

The problem is anytime we want to edit a row in one column of a nested table based on the value of the row in another column of the nested table. For instance, Lindsay is working on updating a bunch of insect fossil names and she needed to be able to batch update the Identified By from "unknown" to "W. D. Pierce" and add values for Date Identified and Comments on the original type specimen IDs (i.e. those where Filed As = "Yes"). Some of these specimens also have a second ID row because their taxonomy has been updated, which means that Lindsay had to do what I would consider a pretty awkward workaround to accomplish her goal:

Global replace Date Identified = blank --> Date Identified = "1954"
Global replace "first occurrence only" Identified By = "unknown" --> Identified By = "PLACEHOLDER"
Global replace "first occurrence only" Comment = blank --> Comments = "PLACEHOLDER"
Global replace Identified By = "unknown" --> Identified By "W. D. Pierce"
Global replace Identified By = "PLACEHOLDER" --> Identified By "unknown"
Global replace Comment = blank --> Comments = "...[text]..."
Global replace Comment = "PLACEHOLDER" --> Comments = blank

See irn = 2094730 for an example of a record that was fixed. The above works, but introduces a lot of opportunities for making mistakes with global replace, not to mention takes many more times longer than it should. Any ideas? I know we've discussed the same issue before in other modules, and for some of the instances you've been able to show us a smarter workaround.

ekrimmel commented 5 years ago

Response from Bill:

This is easier as an export and re-import, especially if there are many records to change. Imports also allow you to check and test your data, and your import file is a nice copy of what you've done. I'll work out the steps and send it to you soon.

wlmertz commented 5 years ago

I will finish building an export-import work-around after the EMu re-designs. Although nested tables are useful for storing and organizing disparate but related data, they have a few drawbacks, and this is one of them.

ljwalker commented 4 years ago

Response from Bill:

Can you give me a current example request? I've built an export-import that deals with this issue, but I'd like a real-life example to test and use as a sample for you.

As stated a while back, there is no easy solution because there is no way to know where, within the table, each piece of data resides.

All solutions to this require an import. Unfortunately, Global Replace does not have an "if" statement.

ljwalker commented 4 years ago

Similar to the example outlined by Erica, Daniel is currently updating the taxonomy of our type specimen records, meaning each record will ultimately end up with multiple rows in the Identification List. This can get cumbersome when, after the fact, there's a need to add/edit data in one field when there are multiple rows in the table, meaning you can't tell Global Replace which "box" to fill in with that information. For example, when you want to add an ID Reference to just one row in the table, but this field is blank for all existing rows.

The best, most current, example I can think of was when I had you infill ID Reference with IRN 2989 (Evenhuis, 1994-2019) to a group of records a few weeks ago. In that case, the parameters were to attach IRN 2989 to all rows in the Identification List where Identified by = "Lindsay Walker". A similar group still exists in EMu if you want to access it: "Add Evenhuis 1994-2019 to ID Reference for LW identifications."

wlmertz commented 4 years ago

Here is an example of an import CSV:

irn,Ide2Comments0(2=),Ide2IDRef_tab(1=).irn 2021885,Added Comment,2989

In the above case, catalog IRN 2021885 will have the Comments in the second row modified, and the ID reference in the first row modified. Nothing else will change. All existing data remains intact.

The time-consuming part is determining which row(s) should be modified for each record. This can only be done by exporting the data, sorting and editing, then re-importing.

For now, please contact me when you need these kinds of changes to your data.

LACMIP / emu

nested tables #31