This PR establishes a library for generating SQL queries from templates. It partially addresses issue #94 and resolves issues #92 and #90 (the latter was previously closed because of the older filtering implementation).
Not all SQL queries have been refactored yet; the remaining work should continue under #94.
Below is an example of the output log produced by an infrastructure test after applying these changes:
INFO casp.cell_data_manager.sql.query:query.py:46 Rendered SQL Query:
create or replace table `dsp-cell-annotation-service.cas_test_dataset.test_extract_homo_sap__extract_cell_info` partition by range_bucket(extract_bin, generate_array(0, 40000, 10)) cluster by extract_bin as
select cas_cell_index,
cas_ingest_id,
cell_type,
total_mrna_umis,
donor_id,
assay,
development_stage,
disease,
organism,
sex,
tissue,
dataset_filename,
cast(floor((row_number() over () - 1) / 10000) as int) as extract_bin
from `dsp-cell-annotation-service.cas_test_dataset.test_extract_homo_sap__extract_cell_info_randomized` c
This demonstrates how casp/cell_data_manager/sql.templates/prepare_curriculum/prepare_cell_info.sql.mako was rendered during the extraction phase.
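For reviewers unfamiliar with the approach, here is a minimal sketch of how a parameterized template can produce a query shaped like the one in the log above. The PR itself uses Mako templates. This sketch substitutes Python's standard-library `string.Template` so that it runs without extra dependencies. The template text, the function `render_prepare_cell_info`, and all parameter names are hypothetical and are not taken from `prepare_cell_info.sql.mako`.

```python
from string import Template

# Hypothetical stand-in for prepare_cell_info.sql.mako. The real template's
# contents and parameter names are not shown in this PR description.
PREPARE_CELL_INFO = Template(
    "create or replace table `$project.$dataset.${prefix}__extract_cell_info`\n"
    "partition by range_bucket(extract_bin, generate_array(0, $max_bins, $bin_step))\n"
    "cluster by extract_bin as\n"
    "select $columns,\n"
    "    cast(floor((row_number() over () - 1) / $bin_size) as int) as extract_bin\n"
    "from `$project.$dataset.${prefix}__extract_cell_info_randomized` c"
)


def render_prepare_cell_info(
    project: str,
    dataset: str,
    prefix: str,
    columns: list[str],
    bin_size: int = 10000,
    max_bins: int = 40000,
    bin_step: int = 10,
) -> str:
    """Fill the template; substitute() raises KeyError if a placeholder is missing."""
    return PREPARE_CELL_INFO.substitute(
        project=project,
        dataset=dataset,
        prefix=prefix,
        columns=",\n    ".join(columns),
        bin_size=bin_size,
        max_bins=max_bins,
        bin_step=bin_step,
    )


if __name__ == "__main__":
    # Values copied from the log excerpt above; the column list is shortened.
    print(
        render_prepare_cell_info(
            project="dsp-cell-annotation-service",
            dataset="cas_test_dataset",
            prefix="test_extract_homo_sap",
            columns=["cas_cell_index", "cas_ingest_id", "cell_type"],
        )
    )
```

Keeping the SQL text in a template and the parameters in code means the same query can be rendered for different datasets, and the rendered result can be logged for inspection, as in the excerpt above.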