zavolanlab / htsinfer

Infer metadata for your downstream analysis straight from your RNA-seq data
Apache License 2.0

feat: infer lib type if mate info not in seq IDs #129

Closed BorisYourich closed 1 year ago

BorisYourich commented 1 year ago

Description

I have added a GetOrientation instance to the GetLibType class so that, when the sequence identifiers of the two files do not match, the files are mapped separately. The alignments for each file are then compared, and if at least a cutoff fraction of the reads can be considered concordant, results.relationship is changed to "split_mates".

Note that this implementation is written entirely in Python and can easily become the bottleneck of the program: the mappings of the two mates are compared all-vs-all, so if a high fraction of reads are multi-mapping, execution gets significantly slower.
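
For illustration only, here is a minimal Python sketch of the kind of all-vs-all comparison described above. It is not the code from this PR: the `Alignment` record, the concordance rule (same reference, opposite strands, start positions within `max_distance`), the assumption that reads are paired by their order in the two files, and the default cutoff value are all hypothetical. Only the "split_mates" value and the idea of a cutoff fraction come from the description.

```python
from typing import List, NamedTuple


class Alignment(NamedTuple):
    """Hypothetical minimal alignment record (not an htsinfer type)."""
    reference: str
    position: int
    is_reverse: bool


def is_concordant(a: Alignment, b: Alignment, max_distance: int = 1000) -> bool:
    """Assumed rule: same reference, opposite strands, close together."""
    return (
        a.reference == b.reference
        and a.is_reverse != b.is_reverse
        and abs(a.position - b.position) <= max_distance
    )


def fraction_concordant(
    mate_1: List[List[Alignment]],
    mate_2: List[List[Alignment]],
) -> float:
    """Fraction of read pairs with at least one concordant alignment pair.

    Each inner list holds all alignments of one read. Reads are assumed to
    be paired by their order in the two files.
    """
    n_pairs = min(len(mate_1), len(mate_2))
    if n_pairs == 0:
        return 0.0
    concordant = 0
    for alns_1, alns_2 in zip(mate_1, mate_2):
        # All-vs-all: the cost per read pair is len(alns_1) * len(alns_2),
        # which is why multi-mapping reads slow the comparison down.
        if any(is_concordant(a, b) for a in alns_1 for b in alns_2):
            concordant += 1
    return concordant / n_pairs


def looks_like_split_mates(
    mate_1: List[List[Alignment]],
    mate_2: List[List[Alignment]],
    cutoff: float = 0.85,  # hypothetical default, not taken from the PR
) -> bool:
    """True if at least a cutoff fraction of read pairs is concordant."""
    return fraction_concordant(mate_1, mate_2) >= cutoff


if __name__ == "__main__":
    first = [[Alignment("chr1", 100, False)]]
    second = [[Alignment("chr1", 300, True)]]
    relationship = "split_mates" if looks_like_split_mates(first, second) else None
    print(relationship)
```

The nested `any(...)` makes the quadratic cost per read pair explicit: a read with 10 alignments paired with a mate with 10 alignments needs up to 100 comparisons.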

Fixes #85

codecov[bot] commented 1 year ago

Codecov Report

Patch coverage: 100.00% and project coverage change: +0.01 :tada:

Comparison is base (bf11484) 99.79% compared to head (2a4b122) 99.80%.

Additional details and impacted files

```diff
@@            Coverage Diff            @@
##              dev     #129     +/-   ##
==========================================
+ Coverage   99.79%   99.80%   +0.01%
==========================================
  Files          12       12
  Lines         957     1024     +67
==========================================
+ Hits          955     1022     +67
  Misses          2        2
```

| [Impacted Files](https://app.codecov.io/gh/zavolanlab/htsinfer/pull/129?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zavolanlab) | Coverage Δ | |
|---|---|---|
| [htsinfer/cli.py](https://app.codecov.io/gh/zavolanlab/htsinfer/pull/129?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zavolanlab#diff-aHRzaW5mZXIvY2xpLnB5) | `100.00% <100.00%> (ø)` | |
| [htsinfer/get\_library\_type.py](https://app.codecov.io/gh/zavolanlab/htsinfer/pull/129?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zavolanlab#diff-aHRzaW5mZXIvZ2V0X2xpYnJhcnlfdHlwZS5weQ==) | `100.00% <100.00%> (ø)` | |
| [htsinfer/htsinfer.py](https://app.codecov.io/gh/zavolanlab/htsinfer/pull/129?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zavolanlab#diff-aHRzaW5mZXIvaHRzaW5mZXIucHk=) | `100.00% <100.00%> (ø)` | |
| [htsinfer/models.py](https://app.codecov.io/gh/zavolanlab/htsinfer/pull/129?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=zavolanlab#diff-aHRzaW5mZXIvbW9kZWxzLnB5) | `100.00% <100.00%> (ø)` | |


balajtimate commented 1 year ago

Sure, I'll run some tests and get back to it soon.

BorisYourich commented 1 year ago

Sure, thanks for taking the time. I will fix that as soon as possible.

BorisYourich commented 1 year ago

The issue was that I didn't consider reads that do not map at all. There is no need to include such alignments, so I added a check.
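
A check of this kind could look roughly like the sketch below. This is an assumption, not the code that was pushed: it supposes the alignments are available as SAM text lines and drops records whose FLAG field has the 0x4 ("segment unmapped") bit set. The helper names are hypothetical.

```python
from typing import Iterable, List

# SAM FLAG bit meaning "segment unmapped"
UNMAPPED_FLAG = 0x4


def is_mapped(sam_line: str) -> bool:
    """Return True if the SAM record's FLAG does not have the 0x4 bit set."""
    fields = sam_line.rstrip("\n").split("\t")
    flag = int(fields[1])  # FLAG is the second column of a SAM record
    return not (flag & UNMAPPED_FLAG)


def mapped_records(lines: Iterable[str]) -> List[str]:
    """Keep alignment records of mapped reads; skip '@' header lines."""
    return [
        line for line in lines
        if not line.startswith("@") and is_mapped(line)
    ]
```

Filtering before the comparison means unmapped reads never enter it, so they add no work and cannot be counted as concordant.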

BorisYourich commented 1 year ago

Okay, I just pushed the changes that Alex recommended. I think you can merge this.