Open simonvadee opened 3 years ago
Mapping is a better word for this concept but should we use plural or singular ?
I don't see any problem with using the singular, as it may happen that we manipulate several of these objects at once.
I also have a suggestion to make about the Attribute.path attribute, which has the same name as ElementDefinition.path but is not the same as a Fhir Path concept at all. Especially in the FhirResourceTree, this may lead to confusion.
Great initiative @simonvadee 👍
I would totally go for Database. It's simpler and more straightforward than Credentials.
DatabaseSchema is also a good idea. Maybe I prefer this one over just Schema, in order to be more precise? Not a strong opinion.
Great question about Source also. If it were up to me, I would go for something broader, for example Project. So far it seems that the scope and nature of a "source" has indeed been a medical software (Chimio or Millenium, for instance). But as you said @simonvadee, it could be an HL7 data flow. Another reason why I like a term like Project is that it puts no bias on how Pyrog is used: maybe in the future we will find it more convenient to split a mapping between several projects, even if it is the same "source"? Otherwise maybe we can wait for @elsiehoffet-94 and @nriss on this.
I think Mapping is indeed a better idea than Resource. To me, this object is a mapping (well, a set of rules) between a source table (possibly filtered, joined... and with a PK defined) and a FHIR resource. What I didn't like about Resource is that to me it only referred to the last part of the mapping (the destination). Also, I think Mapping should stay singular, as it doesn't refer to the many "column => FHIR attributes" mappings inside it, but to the broader correspondence between one source table and a FHIR resource. For example this one
Column: don't have a strong opinion on this one
Yes, source should definitely be renamed, and project suits better and is more flexible (many projects for one database, different kinds of data origin...).
Regarding mapping, it seems fine to me, but I can anticipate some confusion: what about code mappings (between terminologies, aka conceptmaps), and what do we call the DBT rules? @nriss any idea about the latter?
Credentials vs Database: we currently use Credential to refer to database connection information (host, port, login, password, database name) but it can be confusing ("credentials of what? a user? to what?").
I suggest using the same word as airbyte: connection.
Just an idea to think about: what if these connections were set in another part of pyrog, so that when we want to create a source, we can choose a predefined connection?
Owner vs Schema: a database may contain many "schemas". I think back in the day @Jasopaum and I were confused about the difference between the two terms (and I think that the same word has a different meaning in postgres, mssql and oracle). We currently use Owner but I think DatabaseSchema or just Schema would be more accurate.
In airbyte, the form is updated depending on the choice of db
Source vs something else?: a Source has a Credential (ie: it is linked to a database) and has many Resource. It is meant to represent a "source of information" from which we want to be able to extract data in order to create FHIR resources. For now, it can only be an SQL database, and maybe that's fine as long as the pyrog scope remains unclear. For instance, when the data source is a "flux" (eg: an SFTP server with csv files), do we want pyrog/river to be aware of this? This comes back to the datalake question and I'm not sure we want to address this here. However, don't hesitate if you have suggestions!
I agree with you @elsiehoffet-94 and @MiskoG, project seems great, I don't see any better word for now
Resource vs Mapping or Mappings: This is the term for which we have the most ambiguity right now. It is called Resource in the back and Mapping in the webapp (lol what a great idea we had). I think we all agree that Resource is too vague and refers to too many concepts (even in software engineering in general). Mapping is a better word for this concept but should we use plural or singular ?
Mapping is ok for me. Why are you hesitating between singular and plural? It depends on the situation, no? I don't have any idea about which is best.
What are you calling DBT rules, @elsiehoffet-94? Is it the SQL query that generates the dbt views? In my opinion, there is no need to name that, because it is seen as a regular table in pyrog.
Column: is meant to represent a database column, but it also has table and owner fields. I think this one is fine (until we normalize the schema and use a single Column object for a column of the database) but I mention it anyway.
👍
This is meant to be an open discussion before actually starting to rename stuff across the back and the front.
Problem
We don't use the same terminology to reference the same concepts across the web application and the back-end. I think we can use this discussion to discuss naming in general (where it can be controversial) and use this occasion to harmonize the terminology we use.
Description
Credentials vs Database: we currently use Credential to refer to database connection information (host, port, login, password, database name) but it can be confusing ("credentials of what? a user? to what?").
Owner vs Schema: a database may contain many "schemas". I think back in the day @Jasopaum and I were confused about the difference between the two terms (and I think that the same word has a different meaning in postgres, mssql and oracle). We currently use Owner but I think DatabaseSchema or just Schema would be more accurate.
Source vs something else?: a Source has a Credential (ie: it is linked to a database) and has many Resource. It is meant to represent a "source of information" from which we want to be able to extract data in order to create FHIR resources. For now, it can only be an SQL database, and maybe that's fine as long as the pyrog scope remains unclear. For instance, when the data source is a "flux" (eg: an SFTP server with csv files), do we want pyrog/river to be aware of this? This comes back to the datalake question and I'm not sure we want to address this here. However, don't hesitate if you have suggestions!
Resource vs Mapping or Mappings: This is the term for which we have the most ambiguity right now. It is called Resource in the back and Mapping in the webapp (lol what a great idea we had). I think we all agree that Resource is too vague and refers to too many concepts (even in software engineering in general). Mapping is a better word for this concept but should we use plural or singular?
Column: is meant to represent a database column, but it also has table and owner fields. I think this one is fine (until we normalize the schema and use a single Column object for a column of the database) but I mention it anyway.
Implementation
First, let's agree on the naming. Then we can rename one concept per PR (each renaming means a new database migration in the back and updating the front and back code).
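To make the relationships discussed above easier to picture, here is a minimal sketch of the model using some of the candidate names from this thread. It is only an illustration: nothing has been decided, the class names Project, Connection and Mapping are just the proposals made so far, and the fhir_type and primary_key fields are hypothetical. It is not the actual pyrog schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Connection:
    """Database connection information (currently called Credential)."""
    host: str
    port: int
    login: str
    password: str
    database: str


@dataclass
class Column:
    """A database column; today it also carries its table and owner."""
    schema: str  # currently the `owner` field
    table: str
    name: str


@dataclass
class Mapping:
    """Rules between one source table and one FHIR resource (currently Resource)."""
    primary_key: Column  # hypothetical field name
    fhir_type: str       # hypothetical field name


@dataclass
class Project:
    """A source of information (currently Source): one connection, many mappings."""
    name: str
    connection: Connection
    mappings: List[Mapping] = field(default_factory=list)


# Example values are made up for illustration.
connection = Connection("localhost", 5432, "user", "secret", "hospital")
project = Project(name="Chimio", connection=connection)
project.mappings.append(
    Mapping(primary_key=Column("public", "patients", "id"), fhir_type="Patient")
)
```

With names like these, a Project keeps exactly one Connection and any number of Mapping objects, which matches the "a Source has a Credential and has many Resource" description in the issue.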