SolutionGuidance / psm

Welcome to the Medicare/Medicaid Provider Enrollment Screening Portal
http://projectpsm.org/
Other
26 stars 18 forks source link

Improve database schema #57

Open jasonaowen opened 7 years ago

jasonaowen commented 7 years ago

The way the database is structured is not always obvious or intuitive to me. Track such problems here.

jasonaowen commented 7 years ago

LookupEntity-derived classes (such as ProviderType) all have a code primary key, which is usually (but not always) a two-character string in which numbers are usually (but not always) placed. This limits how many instances we can have, and is confusing. What is the motivation for that? Can we switch to using numbers?

jasonaowen commented 7 years ago

There are several fields defined as CHARACTER VARYING(1) that only ever get the values Y or N; these should be converted into booleans. See, for example, Organization#billingSameAsPrimary.

cecilia-donnelly commented 7 years ago

In the enrollments table, the column now called enrollment_id matches up with a field called ticketId. We need to standardize on one name, likely enrollment_id.

cecilia-donnelly commented 7 years ago

In some places, a foreign key relationship in the database is not linked in the code, or even actually set as a real foreign key in the database itself. For example, in the entities table, profile_id should be a foreign key to the provider_profiles table.

@jasonaowen writes in https://github.com/OpenTechStrategies/psm/pull/107#discussion_r121133674:

Yeah, we should fix this kind of thing to be properly linked in code (with, for example, a ProviderProfile), in Hibernate (with @OneToOne or @ManyToMany or whatever), and in the database (with a foreign key).

cecilia-donnelly commented 7 years ago

Enrollments may be linked to other tables via fields ProfileId and TicketId. @jasonaowen writes in #120:

fields named things like ticketId or profileId suggest to me that there is a relationship there that neither Hibernate nor PostgreSQL know about, but should; I'm holding off on converting those for the moment, but fixing those kinds of things should be part of #57.

cecilia-donnelly commented 6 years ago

Note that currently the way that data is linked to enrollments means that searching for data about all enrollments is an N+1 query (see PR #714), where query-ers need to get all enrollments and then information about each of them as a separate query. We should change this when we reorganize the schema.

jasonaowen commented 6 years ago

There are several created_by fields on various tables, as well as submitted_by and changed_by on enrollments. When those are filled in, they are the text of the username of the user who created the action. Instead, they should be foreign keys into cms_user.

This would have a few benefits:

jasonaowen commented 6 years ago

Some notes from reviewing #817:

slifty commented 6 years ago

This is a general comment that came out of #817 -- there are many fields that should be made not nullable across the entire schema. Work needs to be done to identify which fields, but it relates to this issue.

frankduncan commented 6 years ago

From the provider types report #784, there is an unforced guarantee that entries in the entities table have a provider_type_code. Recorded here for posterity.

cecilia-donnelly commented 6 years ago

Make contact id required for entities (at least providers) -- see discussion in PR #832.

jasonaowen commented 6 years ago

affiliations.acknowledgement_attachment_id seems to be unused; it is only set or read in bindToHibernate and bindFromHibernate as part of the round-trip to the JAXB data type QualifiedProfessionalType; QualifiedProfessionalType.acknowledgementObjectId, in turn, is only set or read in the same methods.

I noticed this while reviewing #908 because the type of the column is TEXT; not only does it not have a foreign key relation to documents or binary_contents, it can't, because the data types are different.

jasonaowen commented 6 years ago

The beneficial_owner table has two columns for interest, own_interest_pct and oth_provider_own_pct. They're currently specified as (SQL-standard) FLOAT, which PostgreSQL represents with double precision. Since these are for user-input decimal numbers, we should instead use (also SQL-standard) data type DECIMAL.

See also https://www.postgresql.org/docs/current/static/datatype-numeric.html .

jasonaowen commented 6 years ago

We should closely examine the various lookup tables (which back @Entitys that derive from LookupEntity) and consider whether they would be better as enums. How likely is it that the values would ever change at run time? Would adding, renaming, or deleting any of the values require a corresponding code change? Are values referred to by hard-coded string constants, particularly those in ViewStatics?

frankduncan commented 6 years ago

status in cms_user table should be not nullable. Related to #33