farhat-lab / gentb-site

The genTB project, the Django site, variant calling and prediction pipeline, and mapping pipeline with hooks to two ravens
https://gentb.hms.harvard.edu
Other

Heatmap creation for WDNN pipeline #237

Closed mgro closed 3 years ago

mgro commented 3 years ago

The heatmap is not produced for the WDNN pipeline. I'll post two JSON examples below:

  1. 2.1.pipeline.txt - the default 2.1. pipeline JSON output currently in routine use.

  2. WDNN_pipeline.txt - no heatmap is produced.

Thank you!

doctormo commented 3 years ago

These two files are completely different structures.

Removing that top layer of lists makes the entire format wrong. Even if the output contains no mutations, it MUST retain the formatting and include every layer of the output format. The order of the drugs was a red herring.

mgro commented 3 years ago

Ok - so the solution would be to add two empty dictionary-list-lists to this output and retain the formatting, and to avoid 'retrofitting' of the heatmap scripts, right?

doctormo commented 3 years ago

I've added the ability to specify heatmap data without any mutations. This functionality didn't exist before.

Now you can add:

[
  $your data here,
  {}, {}
]

And this should work.

[Image: Screenshot from 2021-05-16 14-35-14]

mgro commented 3 years ago

Thanks @doctormo - so just to confirm, this would be the format that'll work: [[$NAME, $DRUG1, $PREDICT1, $FP1, $FN1], [$NAME, $DRUG2, $PREDICT2, $FP2, $FN2], ..., {}, {}]

I would thus take the current output and append two empty dictionaries: final_predict = final_predict + [{} for _ in range(2)]

doctormo commented 3 years ago

Yes!

mgro commented 3 years ago

@doctormo could you check this genTB run: https://gentb.hms.harvard.edu/predict/07dff4bb1c59eb525c653562cd8d4ba2/

Here is the JSON output file that does not get visualized despite the two empty dictionaries added: WDNN.txt

doctormo commented 3 years ago

There is no *matrix.json file in these results. The name MUST end with matrix.json for it to be considered a matrix formatted json data file.
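A minimal sketch of the naming rule described above, assuming the site only looks at the end of the file name. The actual gentb-site check is not shown in this thread, and `is_matrix_file` is a hypothetical helper used purely for illustration:

```python
from pathlib import Path


def is_matrix_file(filename: str) -> bool:
    """Return True if the file name ends with 'matrix.json'.

    Assumption: the rule is a plain suffix test on the file name,
    as the comment above suggests; the real check may differ.
    """
    return Path(filename).name.endswith("matrix.json")


# 'WDNN.txt' would not be picked up; a '.matrix.json' name would.
print(is_matrix_file("WDNN.txt"))
print(is_matrix_file("WDNN.matrix.json"))
```

Under this assumption, renaming the output so it ends in `matrix.json` is what lets the site treat it as matrix data at all; whether the content then parses is a separate question.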

mgro commented 3 years ago

@doctormo now with the .matrix.json ending I'm getting the error 'the matrix file has critical errors and cannot be parsed'. Could you double check if the format is ok? Here's the dataset: https://gentb.hms.harvard.edu/predict/620f23d11bc75b255a859e8458a5ef20/

doctormo commented 3 years ago

You've added two dictionaries to the end of the second-order list, instead of adding the list and the two empty dictionaries to a new first-order list.

You did: [A, B] -> [A, B, {}, {}]
You need: [A, B] -> [[A, B], {}, {}]
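The difference between the two shapes can be sketched in Python as follows. This is only an illustration: the row values are made-up placeholders, `wrap_for_heatmap` is a hypothetical helper, and the row layout (name, drug, prediction, false positive, false negative) is taken from the format quoted earlier in this thread.

```python
import json


def wrap_for_heatmap(rows):
    """Put the prediction rows in a NEW outer list, followed by two empty dicts."""
    return [rows, {}, {}]


# Placeholder rows: [name, drug, prediction, false positive, false negative]
final_predict = [
    ["sample1", "drugA", 0.91, 0.05, 0.10],
    ["sample1", "drugB", 0.12, 0.03, 0.08],
]

# What was tried first: the dicts end up inside the list of rows.
wrong = final_predict + [{} for _ in range(2)]  # [A, B, {}, {}]

# What is needed: the list of rows becomes the first element of a new list.
right = wrap_for_heatmap(final_predict)  # [[A, B], {}, {}]

print(json.dumps(right))
```

Here `wrong` has four top-level elements (two rows and two dicts), while `right` always has exactly three: the list of rows and the two empty dictionaries.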

mgro commented 3 years ago

Thank you Martin - works now! Coding is so much fun (esp when it works in the end :))