Nomen clature for columns

rdlrt commented 2 years ago

Hi Team,

We were thinking of adding a CIP for normalising variable names (and types) used at query layers using existing providers as baseline. Since a lot of the tooling extracts information via mini-Protocol or dbsync, would be good to know if there is a naming convention (for object representations like block_no , hash) that is preferred to use as source or if there is flexibility/scope to make things more uniform where possible, couple of examples below:

We have columns like hash for block and transaction itself, but then we have hash_raw/view for objects which often use bech32 representation (pool/stake_addresses), while for address representation , we have address_raw/address for hex/bech32 values.
References like value in (tx_out collateral_tx_out) vs amount in account related tables (withdrawal, reward , reserves, datum) vs quantity in asset representations (ma_tx_mint/ma_tx_out)

There are still patterns here, so these are probably in place based on some logic. Depending on preferences (updates to dbsync vs only restrict ourselves to query layers themselves), we might be able to prepare a better initial draft and add issue/suggestions against appropriate destination repositories.

erikd commented 2 years ago

Totally agree that all the columns named amount and quantity should be renamed value.

Not so sure about the first point (or maybe I am missing the point).

rdlrt commented 2 years ago

Those were just examples, there are a lot more that we would want to perhaps drill down and document , but my question was really if there is a preferred source at the moment where the column names come from (for instance, if there is a relation to the names for dbsync objects being initially built from ledger/consensus code-as-spec), if not - we should be able to brainstorm as ecosystem together across orgs/projects to perhaps have a higher level of consistency for object representation.

About first point, I was referring to use of hash vs hash_raw - which are both similar representation, while using view for bech32 representation compared to use of address vs address_raw for bech32 vs hex representation. One option could be calling all such objects to be of the format $obj_[hash]|[view] . Not proposing this as solution rightaway, our aim is to just start preparing a draft for review with the naming for all objects first that would be commonly used in queries and them see if we can adapt/align better across the ecosystem

ghost commented 2 years ago

Adding these as well: I'd like to suggest to rename tx_in_id and tx_out_id in tx_in and collateral_tx_in table to in_tx_id and out_tx_id

In the tx_in table, the tx_in_id column may be mistaken as the primary key, but there's also a id column, so that may cause some confusion.
In the tx_in, collateral_tx_in and ma_tx_out table, there are the same column names tx_out_id but they are referring to different things. The tx_out_id in tx_in and collateral_tx_in tables actually references the pk of tx table, meanwhile the tx_out_id in ma_tx_out is actually referencing the pk of tx_out table.
Renaming it to in_tx_id and out_tx_id enables us to immediately identify that the data in this column are in fact tx_ids, referencing the tx table, but they are input or output transactions denoted by the in and out prepend. This is also in line with the registered_tx_id columns in other tables such as stake_address where the prepend is further describing the transaction.

kderme commented 1 year ago

Many of these changes sound reasonable. However, they would be breaking changes for exisisting applications and would require integration.

rdlrt commented 1 year ago

IMO - a breaking change has been long pending, would be nice to reduce some of the backlog on these (will also allow for more freedom with tx_out table updates)?

IntersectMBO / cardano-db-sync

Nomen clature for columns #1138