popbr / data-integration

"Converting" FK into spreadsheets #19

Open aubertc opened 1 year ago

aubertc commented 1 year ago

As often, the situation will be much more complex with the actual data we will be working with, but I would like us to try the following.

Imagine we have the following organization:

[Image: mock_DB]

(the idea being that we populated Entity_Researcher using the data available, then cleaned it by matching identical entities, and finally edited GrantDB and PubDB to add an attribute referencing the entity that created them).

We know (I believe) how to output any of those three tables to an Excel spreadsheet, but I don't think we know how to preserve the foreign key from, e.g., GrantDB to Entity_Researcher. We can always reconstruct it, but couldn't we instead create (yet) another table that gives all the information in Entity_Researcher plus, e.g., the total amount of grant money each researcher received, or the number of papers they published in a given year?

Starting with a very small example is probably best :-)
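
To make the idea concrete, here is a minimal sketch of such a summary table, using only Python's standard `sqlite3` and `csv` modules. Everything beyond the three table names is an assumption made for illustration: the column names (`id`, `name`, `amount`, `year`, `researcher_id`), the sample rows, and the helper names `researcher_summary` and `to_csv_text` are all hypothetical. It also produces CSV text (which Excel can open) rather than a native `.xlsx` file.

```python
import csv
import io
import sqlite3

# Hypothetical mock schema: only the table names come from the issue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Entity_Researcher (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE GrantDB (
    grant_id INTEGER PRIMARY KEY,
    amount REAL,
    researcher_id INTEGER REFERENCES Entity_Researcher(id)
);
CREATE TABLE PubDB (
    pub_id INTEGER PRIMARY KEY,
    year INTEGER,
    researcher_id INTEGER REFERENCES Entity_Researcher(id)
);
INSERT INTO Entity_Researcher VALUES (1, 'Ada'), (2, 'Grace'), (3, 'Alan');
INSERT INTO GrantDB VALUES (10, 1000, 1), (11, 500, 1), (12, 250, 2);
INSERT INTO PubDB VALUES (100, 2020, 1), (101, 2021, 1), (102, 2020, 2);
""")


def researcher_summary(conn, year):
    """Return (header, rows): one row per researcher with grant and paper totals."""
    # Each child table is aggregated in its own subquery *before* the join,
    # so grants are not multiplied by the number of publications.
    # LEFT JOIN keeps researchers that have no grants or no papers.
    query = """
    SELECT r.id AS id,
           r.name AS name,
           COALESCE(g.total_amount, 0) AS total_grant_amount,
           COALESCE(p.n_pubs, 0) AS pubs_in_year
    FROM Entity_Researcher AS r
    LEFT JOIN (SELECT researcher_id, SUM(amount) AS total_amount
               FROM GrantDB GROUP BY researcher_id) AS g
           ON g.researcher_id = r.id
    LEFT JOIN (SELECT researcher_id, COUNT(*) AS n_pubs
               FROM PubDB WHERE year = ? GROUP BY researcher_id) AS p
           ON p.researcher_id = r.id
    ORDER BY r.id
    """
    cur = conn.execute(query, (year,))
    header = [col[0] for col in cur.description]
    return header, cur.fetchall()


def to_csv_text(header, rows):
    """Serialize the summary as CSV text, which a spreadsheet program can open."""
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(header)
    writer.writerows(rows)
    return buf.getvalue()


header, rows = researcher_summary(conn, 2020)
csv_text = to_csv_text(header, rows)
```

The foreign key itself is not carried over to the spreadsheet. Instead, it is used once, at export time, to fold the grant and publication information into one flat row per researcher. Writing `csv_text` to a file, or swapping in a library that writes `.xlsx` directly, would be the next step.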