CDCgov / prime-reportstream

ReportStream is a public intermediary tool for delivery of data between different parts of the healthcare ecosystem.
https://reportstream.cdc.gov
Creative Commons Zero v1.0 Universal
72 stars 40 forks source link

Identify and update DB queries to be independent of the Receive step #16409

Open arnejduranovic opened 4 weeks ago

arnejduranovic commented 4 weeks ago

User Story

As a developer of ReportStream, I want to ensure all DB queries are independent of the receive step, so that the pipeline steps can be more flexible and support greater use cases.

Description/Use Case

With the addition of the new submission microservice, the first step that will be recorded in the DB is the Convert step, instead of the Receive step. Because of this, all queries that rely on the Receive step as a kind of base-case need to be rewritten to be agnostic of the Receive step.

Some areas to investigate:

Risks/Impacts/Considerations

Dev Notes

We will need to look to identify all queries/feature that apply. A precursory search lends the following two cases:

    fun getRootItem(childReportId: UUID, childIndex: Int, txn: DataAccessTransaction): Item? {
        val cte = itemAncestorGraphCommonTableExpression(
            childReportId,
            childIndex
        )
        val rootItem = DSL.using(txn)
            .withRecursive(cte)
            .select(
                ItemGraphTable.ITEM_GRAPH.asterisk()
            )
            .from(cte)
            .join(REPORT_FILE)
            .on(REPORT_FILE.REPORT_ID.eq(ItemGraphTable.ITEM_GRAPH.PARENT_REPORT_ID))
            .join(ACTION)
            .on(ACTION.ACTION_ID.eq(REPORT_FILE.ACTION_ID))
            .where(ACTION.ACTION_NAME.eq(TaskAction.receive))
            .fetchOneInto(Item::class.java)
        return rootItem
    }
    private fun rootReportRecords(
        txn: DataAccessTransaction,
        cte: CommonTableExpression<Record1<UUID>>,
    ) = DSL.using(txn)
        .withRecursive(cte)
        .select(REPORT_FILE.asterisk())
        .from(cte)
        .join(REPORT_FILE)
        .on(REPORT_FILE.REPORT_ID.eq(cte.field(0, UUID::class.java)))
        .join(ACTION)
        .on(ACTION.ACTION_ID.eq(REPORT_FILE.ACTION_ID))
        .where(ACTION.ACTION_NAME.eq(TaskAction.receive))

The reason rootReportRecords can return more than one record is because a report ID that was created in the send step can contain items from multiple submitted reports (thanks to the batch step). My initial thought here is that instead of relying on the WHERE that looks at the TaskAction, we need to consult with the report_lineage table and only pick reports where REPORT_FILE.REPORT_ID does not have an entry in that table as a child. This may require a new CTE?

Acceptance Criteria

MichaelEsuruoso commented 3 weeks ago

Hey team! Please add your planning poker estimate with Zenhub @adegolier @arnejduranovic @david-navapbc @jack-h-wang @jalbinson @JFisk42 @kant777 @mkalish @thetaurean