broadinstitute / seqr

web-based analysis tool for rare disease genomics
GNU Affero General Public License v3.0

First time tagging heterozygous pair on variant search messes up tags on variants #1410

Open NLSVTN opened 4 years ago

NLSVTN commented 4 years ago

When we tag a heterozygous pair on the variant search page for the first time (the create_saved_variant_handler method is called), the following piece of code does not correctly fill the 'variantGuids' parameter of each tag. As a result, the returned 'variantTagTypes' contains tags with only a single item in 'variantGuids' instead of the two expected for pairs:

https://github.com/macarthur-lab/seqr/blob/master/seqr/views/utils/orm_to_json_utils.py#L416

It then shows tags incorrectly. Reloading the page fixes the issue.

Here is what the bug looks like on a single pair. Before tagging:

Screen Shot 2020-08-04 at 1 58 18 PM

After tagging:

Screen Shot 2020-08-04 at 1 58 33 PM

It may also be an issue with creating/deleting tags when the same variants appear in different pairs. I am not fully sure.

To reproduce the bug you can do the following:

  1. Run a search with the following parameters selected:
Screen Shot 2020-08-04 at 2 57 44 PM
  2. Find the first two ZNF717 gene pairs and assign pair and individual tags to the first pair

  3. Assign a pair tag to the second ZNF717 gene pair and check how the first pair now looks

hanars commented 4 years ago

Can you confirm that you are using the latest version of the code when you experience this issue? Can you also send a screenshot of the page footer, which includes the seqr version? That would help me debug.

NLSVTN commented 4 years ago

Ok, will do.

NLSVTN commented 4 years ago

Unfortunately, it's not possible because we would need to modify Postgres for the newly added fields. After I pulled the code I get the error 'column seqr_individual.sv_flags does not exist LINE 1: ...latform_filters", "seqr_individual"."population", "seqr_indi... ^'. But if there were no changes to the handler that creates the variants, the update-variant functions, or the create-tags function, the bug should still be there, I think. What seems to happen is that when we tag in variant search, tags are created incorrectly, so the VariantTag model then contains wrong elements, which then show up.

I reverted to the version that has the bug, and it is the following (from the footer): seqr v1.0-3d8d6a9d

NLSVTN commented 4 years ago

No, it's not a wrong DB modification, sorry. The DB seems fine, but the bug still happens.

NLSVTN commented 4 years ago

When the iteration happens here:

https://github.com/macarthur-lab/seqr/blob/master/seqr/views/utils/orm_to_json_utils.py#L421

The tags of the other pair (the one not currently being tagged) are returned with only one item in the 'variantGuids' list.

NLSVTN commented 4 years ago

It happens because the saved variants in the pair being tagged are not all of the saved variants that we need to consider to fill 'variantGuids':

The pair being tagged has variants 1 and 2.
Another pair has variants 1 and 3.

If we assign a tag to the first pair, we leave the 3rd variant out of the picture, so the 'variantGuids' of the tags assigned to the second pair will have only one item. We either need to exclude these tags from the iteration or fill them in correctly.
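
To make the scenario concrete, here is a minimal in-memory sketch of it in plain Python. It does not use seqr's models or code: the `TAGS` dictionary, the GUID strings, and the `build_variant_guids` helper are hypothetical names invented for illustration, and the helper only assumes that the response fills 'variantGuids' from the saved variants that were loaded for the current request.

```python
# Hypothetical toy model of the issue; none of these names come from seqr.

# Each tag lists the GUIDs of every saved variant it is attached to.
TAGS = {
    "tag_pair_A": ["SV1", "SV2"],  # the pair being tagged now
    "tag_pair_B": ["SV1", "SV3"],  # another pair that shares variant SV1
}


def build_variant_guids(tags, loaded_variant_guids):
    """Assumed behavior: only GUIDs of the loaded saved variants are filled in.

    A tag is included if it touches at least one loaded variant, but its
    GUID list is restricted to the loaded variants.
    """
    loaded = set(loaded_variant_guids)
    return {
        tag: [guid for guid in guids if guid in loaded]
        for tag, guids in tags.items()
        if loaded.intersection(guids)
    }


# Tagging pair A loads only SV1 and SV2, so SV3 is missing from pair B's tag.
response = build_variant_guids(TAGS, ["SV1", "SV2"])
print(response)
```

In this toy model the tag of the other pair comes back with a single GUID, which matches the symptom described above.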

NLSVTN commented 4 years ago

So, I think this line needs to be modified:

https://github.com/macarthur-lab/seqr/blob/master/seqr/views/utils/orm_to_json_utils.py#L410

I am looking into it, will let you know if I manage to fix it

NLSVTN commented 4 years ago

Ok, I fixed it but not in a beautiful way:

In create_saved_variant_handler I pass a 'paired' parameter to the 'get_json_for_saved_variants_with_tags' function:

paired = isinstance(variant_json['variant'], list) 
response_json = get_json_for_saved_variants_with_tags(saved_variants, paired=paired, add_details=True)

Then, in get_json_for_saved_variants_with_tags:

    tag_models = VariantTag.objects.filter(id__in=variant_tag_id_map.keys())

    if paired:
        # Keep only the tags that are attached to every saved variant in the pair
        for variant_id in saved_variant_id_map.keys():
            tag_models = tag_models.filter(saved_variants=variant_id)

So, when we tag by pair, we limit the tags to only those that have both variants in their saved_variants list. Maybe you could fix it in a more elegant way.
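
The effect of that chained filter can be sketched without Django as a subset check. This is only an illustration under assumptions: the `TAGS` data and the `tags_covering_all_variants` helper are hypothetical and stand in for the queryset filtering, they are not seqr code.

```python
# Hypothetical toy model of the workaround; none of these names come from seqr.

TAGS = {
    "tag_pair_A": ["SV1", "SV2"],  # the pair being tagged
    "tag_pair_B": ["SV1", "SV3"],  # another pair that shares SV1
    "tag_single": ["SV1"],         # an individual tag on SV1 only
}


def tags_covering_all_variants(tags, pair_variant_guids):
    """Keep only tags attached to every variant in the pair being tagged."""
    required = set(pair_variant_guids)
    return {
        tag: guids
        for tag, guids in tags.items()
        if required.issubset(guids)
    }


result = tags_covering_all_variants(TAGS, ["SV1", "SV2"])
print(result)
```

In this toy model the tag of the other pair is no longer returned with an incomplete GUID list, but the individual tag on SV1 is dropped as well. Whether that side effect matters in seqr depends on how the response is merged on the client, which this sketch does not cover.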