humlab-sead / sead_query_api

.NET Core port of Query SEAD API
Other
2 stars 0 forks source link

Incorrect number of analysis_entities returned for result-load #134

Open johanvonboer opened 5 months ago

johanvonboer commented 5 months ago

This example shows Reykholt having 1 analysis_entity: image

But it has many more.

This is the SQL returned from the server:

SELECT 
    alias_1, 
    ARRAY_TO_STRING(ARRAY_AGG(DISTINCT alias_2), ',') AS text_agg_of_alias_2, 
    COUNT(alias_3) AS count_of_alias_3, 
    alias_4, 
    alias_5
FROM (
    SELECT 
        tbl_sites.site_name AS alias_1, 
        tbl_record_types.record_type_name AS alias_2, 
        tbl_analysis_entities.analysis_entity_id AS alias_3, 
        tbl_sites.site_id AS alias_4, 
        tbl_sites.site_id AS alias_5, 
        tbl_sites.site_name AS alias_6
    FROM 
        tbl_analysis_entities
    INNER JOIN 
        tbl_datasets 
        ON tbl_datasets.dataset_id = tbl_analysis_entities.dataset_id
    INNER JOIN 
        tbl_physical_samples 
        ON tbl_physical_samples.physical_sample_id = tbl_analysis_entities.physical_sample_id
    INNER JOIN 
        tbl_sample_groups 
        ON tbl_sample_groups.sample_group_id = tbl_physical_samples.sample_group_id
    INNER JOIN 
        tbl_sites 
        ON tbl_sites.site_id = tbl_sample_groups.site_id
    INNER JOIN 
        tbl_abundances 
        ON tbl_abundances.analysis_entity_id = tbl_analysis_entities.analysis_entity_id
    INNER JOIN 
        tbl_taxa_tree_master 
        ON tbl_taxa_tree_master.taxon_id = tbl_abundances.taxon_id
    INNER JOIN 
        tbl_taxa_tree_authors 
        ON tbl_taxa_tree_authors.author_id = tbl_taxa_tree_master.author_id
    INNER JOIN 
        tbl_taxa_tree_genera 
        ON tbl_taxa_tree_genera.genus_id = tbl_taxa_tree_master.genus_id
    INNER JOIN 
        tbl_methods 
        ON tbl_methods.method_id = tbl_datasets.method_id
    INNER JOIN 
        tbl_record_types 
        ON tbl_record_types.record_type_id = tbl_methods.record_type_id
    WHERE 
        1 = 1
        AND tbl_taxa_tree_master.taxon_id::text IN ('37303')
        AND tbl_datasets.method_id IN (3, 6)
        AND tbl_sites.site_id IS NOT NULL
    GROUP BY 
        alias_1, alias_2, alias_3, alias_4, alias_5, alias_6
) AS X
GROUP BY 
    alias_1, alias_4, alias_5, alias_6
ORDER BY 
    alias_6;
roger-mahler commented 5 months ago

This SQL returns only one analysis entity. Why should there be more?

select
      tbl_sites.site_name                      as alias_1,
      tbl_record_types.record_type_name        as alias_2,
      tbl_analysis_entities.analysis_entity_id as alias_3,
      tbl_sites.site_id                        as alias_4,
      tbl_sites.site_id                        as alias_5,
      tbl_sites.site_name                      as alias_6,
      facet.abundance_taxon_shortcut.taxon_name
from tbl_analysis_entities
inner join tbl_datasets on tbl_datasets."dataset_id" = tbl_analysis_entities."dataset_id"
inner join tbl_physical_samples on tbl_physical_samples."physical_sample_id" = tbl_analysis_entities."physical_sample_id"
inner join tbl_sample_groups on tbl_sample_groups."sample_group_id" = tbl_physical_samples."sample_group_id"
inner join tbl_sites on tbl_sites."site_id" = tbl_sample_groups."site_id"
inner join facet.abundance_taxon_shortcut on facet.abundance_taxon_shortcut."analysis_entity_id" = tbl_analysis_entities."analysis_entity_id"
left join tbl_methods on tbl_methods."method_id" = tbl_datasets."method_id"
left join tbl_record_types on tbl_record_types."record_type_id" = tbl_methods."record_type_id"
where 1 = 1
  and (facet.abundance_taxon_shortcut.taxon_id::text in ('37303'))
  and tbl_datasets.method_id in (3, 6)
  and tbl_sites.site_id = 4602
johanvonboer commented 5 months ago

Because it's impossible for a site which contains multiple datasets to only have 1 analysis_entity, and because of this sql:

  SELECT * FROM tbl_sites
  JOIN tbl_sample_groups ON tbl_sample_groups.site_id=tbl_sites.site_id
  JOIN tbl_physical_samples ON tbl_physical_samples.sample_group_id=tbl_sample_groups.sample_group_id
  JOIN tbl_analysis_entities ON tbl_analysis_entities.physical_sample_id=tbl_physical_samples.physical_sample_id
  WHERE tbl_sites.site_id=4602
roger-mahler commented 5 months ago

But your example has a filter on taxon_id 37303?