t27duck / active_reporting

OLAP-like DSL for ActiveRecord-based reporting
MIT License
136 stars 19 forks source link

FactModels Table #37

Open alliedarmour opened 4 years ago

alliedarmour commented 4 years ago

Hey,

sorry if the question is in the wrong place, but I need some help to get started.

I'm currently learning on data warehousing for university and my work and want to build a small data warehouse for a project myself. Now I found this gem and trying to understand this gem's usage.

I made myself familiar with star schema, snowflake schema etc. but I don't really understand how I link the FactTable with the actual table in the database.

I know it should be a table with some belongs_to associations to the dimensions, but how would I specify a migration for this? At least I didn't find an example in the docs.

So consider my following example where I'm modeling a simple cube for reporting details to hospital stays and patients. I have 3 model - patient, case and diagnoses:

create_table "cases", force: :cascade do |t|
    t.string "casenumber"
    t.date "admission_date"
    t.date "discharge_date"
    t.bigint "patient_id"
    t.datetime "created_at", precision: 6, null: false
    t.datetime "updated_at", precision: 6, null: false
    t.index ["patient_id"], name: "index_cases_on_patient_id"
  end

  create_table "diagnoses", force: :cascade do |t|
    t.string "icd_code"
    t.text "icd_description"
    t.bigint "case_id", null: false
    t.datetime "created_at", precision: 6, null: false
    t.datetime "updated_at", precision: 6, null: false
    t.index ["case_id"], name: "index_diagnoses_on_case_id"
  end

  create_table "patients", force: :cascade do |t|
    t.string "first_name"
    t.string "last_name"
    t.date "birthdate"
    t.string "gender", limit: 1
    t.string "domicile"
    t.string "street"
    t.integer "zip_code"
    t.integer "house_number"
    t.datetime "created_at", precision: 6, null: false
    t.datetime "updated_at", precision: 6, null: false
  end

  add_foreign_key "diagnoses", "cases"

Now I have a CaseFactModel with the dimensions patient, case and diagnoses

class CaseFactModel < ActiveReporting::FactModel
    self.model = CaseFact

    dimension :patient
    dimension :case
    dimension :diagnosis

    # dimension_filter :for_admission_date, ->(year) { joins(:case).where('extract(YEAR from admission_date) = ?', year) }
end
class CaseFact < ApplicationRecord
  belongs_to :case
  belongs_to :patient
  belongs_to :diagnosis
end

Now, from what I've read this FactTable has to be a database table aswell, right? In the example test files in the repository I couldn't find any migration related to the fact tables. And how does this FactModel map to this table then, as it's not inheriting from ActiveRecord::Base?

Am I getting something completely wrong here? Any help is appreciated, thanks in advance!

t27duck commented 4 years ago

The fact model (the Ruby classed inheriting from ActiveReporting::FactModel links to a regular active record model.

For example, based on you posted, you should have a Case, Diagnosis, and Patient active record model in your application. By convention, if you have a CaseFactModel, active_reporting should automatically link it to Case which is linked to the cases table.

alliedarmour commented 4 years ago

The fact model (the Ruby classed inheriting from ActiveReporting::FactModel links to a regular active record model.

For example, based on you posted, you should have a Case, Diagnosis, and Patient active record model in your application. By convention, if you have a CaseFactModel, active_reporting should automatically link it to Case which is linked to the cases table.

Thanks for your reply. So if I wanted to store a measure I would not do it in an actual fact table (like in a star schema where this physical table would reference patient, diagnosis and case) but I'd rather do it inside the model itself? For example the cost weight of a case or the age of a patient inside patient, did I got this right?

t27duck commented 4 years ago

The measure is the column stored on the physical database table. So it would go in the database. The Rails model will be able to reference it like any other attribute. The fact model class is used to tie that logic to the gem.

So given a table called orders with a total column that you want to be a measure. You'd have an Order Rails model.

To reference the total column as a measure in a fact model in the gem, you'd have something like...

class OrderFactModel < ActiveReporting::FactModel
  self.measure = :total
end
alliedarmour commented 4 years ago

Makes totally sense, good explanation. Last question: is it possible to have more then one measure which can be set?

Taken the order example, may be total and something like shipment cost or something? Those measures would take the order dimension both.

I think therefor you have to create another FactModel, right?

t27duck commented 4 years ago

As the way it stands right now, correct. There is one measure column per fact model. This is mainly due to there hasn't been a use case (or someone wanting it enough) to come up with such a feature. If I were to allow a fact model to have more than one measure, I would probably allow the measure to be set as an array of values with the first element being the default, and then allow the measure to be set on the metric and/or the report object if the user doesn't want to use the default.