quinlan-lab / STRling

Detect novel (and reference) STR expansions from short-read data
MIT License

Comparing STR length from STRling, which value would be appropriate to use? #121

Open chanhee22kim opened 4 months ago

chanhee22kim commented 4 months ago

Hello, thank you for providing a great STR detection tool.

I understand that STRling was developed as a tool to detect outlier expansions, so I'm wondering whether it's appropriate to use it for my purpose. I plan to use STRling to identify STRs and then fit a logistic regression model to assess the association between the length of each STR and a phenotype.

If it's appropriate, I'm curious about how to interpret each STR length. Currently, I assume that the length can be represented by the value 40 * (sum_str_counts / local_depth). Is this assumption correct, or would it be more appropriate to use a different output value? In the paper, log2((sum_str_counts + 1) / local_depth) is used for outlier detection. I wonder which of the two values is more appropriate as the STR tract length predictor in the logistic regression.

Best regards, Chanhee
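
For concreteness, the two candidate values from the question can be computed side by side. The following is a minimal Python sketch, not STRling's own code. It assumes that sum_str_counts and local_depth are available per locus and per sample under the names used in the question. The function names and the example numbers are hypothetical, and the factor 40 is taken from the question rather than confirmed against STRling's documentation.

```python
import math


def linear_length_estimate(sum_str_counts, local_depth, scale=40.0):
    """Value proposed in the question: scale * sum_str_counts / local_depth.

    The default scale of 40 comes from the question and is an assumption here.
    """
    if local_depth <= 0:
        raise ValueError("local_depth must be positive")
    return scale * sum_str_counts / local_depth


def log2_depth_normalized(sum_str_counts, local_depth):
    """Value quoted from the paper: log2((sum_str_counts + 1) / local_depth)."""
    if local_depth <= 0:
        raise ValueError("local_depth must be positive")
    return math.log2((sum_str_counts + 1) / local_depth)


# Hypothetical (sum_str_counts, local_depth) pairs for one locus in three samples.
samples = [(0, 30.0), (150, 30.0), (1200, 30.0)]

for counts, depth in samples:
    linear = linear_length_estimate(counts, depth)
    logged = log2_depth_normalized(counts, depth)
    print(f"counts={counts} depth={depth} linear={linear:.1f} log2={logged:.3f}")
```

The +1 inside the logarithm keeps the value finite when sum_str_counts is zero. The linear value stays on a scale proportional to the depth-normalized count, while the log2 value compresses large counts, so the two predictors would give regression coefficients with different interpretations.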