Closed ntromas closed 2 weeks ago
Hi Nico,
I'm a little confused about what you're asking for. If you're looking for the full coverage of every position, you can access that in the raw_data files (e.g. https://instrain.readthedocs.io/en/latest/advanced_use.html#accessing-other-data)
If it's something else you're looking for please let me know
Matt
Hi Matt,
Sorry if I was confused. Coverage of all positions would work too (covT right?). What I meant initially is that inStrain won't report if there is no variation and basically ref_freq=1. And inStrain won't report if position coverage is null. Just trying to find a way to differentiate both. But maybe it does not make sense...
Nico
Le mar. 17 oct. 2023 à 12:30, Matt Olm @.***> a écrit :
Hi Nico,
I'm a little confused about what you're asking for. If you're looking for the full coverage of every position, you can access that in the raw_data files (e.g. https://instrain.readthedocs.io/en/latest/advanced_use.html#accessing-other-data )
If it's something else you're looking for please let me know
Matt
— Reply to this email directly, view it on GitHub https://github.com/MrOlm/inStrain/issues/161#issuecomment-1766769869, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABY5D6HP5DOUHRR3CFEMEEDX72XBPAVCNFSM6AAAAAA6EBH6TOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONRWG43DSOBWHE . You are receiving this because you authored the thread.Message ID: @.***>
--
Nicolas Tromas PhD McGill University E-mail: @. @.> Researchgate: NTromasPage https://www.researchgate.net/profile/Nicolas_Tromas Web: http://www.shapirolab.ca/
Hi Matt,
What would be the best way to extract this information? Should I just use coverT.hdf5 (any advice to play with this kind of file?)
Cheers,
Nico
Le mar. 17 oct. 2023 à 12:57, Nicolas Tromas @.***> a écrit :
Hi Matt,
Sorry if I was confused. Coverage of all positions would work too (covT right?). What I meant initially is that inStrain won't report if there is no variation and basically ref_freq=1. And inStrain won't report if position coverage is null. Just trying to find a way to differentiate both. But maybe it does not make sense...
Nico
Le mar. 17 oct. 2023 à 12:30, Matt Olm @.***> a écrit :
Hi Nico,
I'm a little confused about what you're asking for. If you're looking for the full coverage of every position, you can access that in the raw_data files (e.g. https://instrain.readthedocs.io/en/latest/advanced_use.html#accessing-other-data )
If it's something else you're looking for please let me know
Matt
— Reply to this email directly, view it on GitHub https://github.com/MrOlm/inStrain/issues/161#issuecomment-1766769869, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABY5D6HP5DOUHRR3CFEMEEDX72XBPAVCNFSM6AAAAAA6EBH6TOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTONRWG43DSOBWHE . You are receiving this because you authored the thread.Message ID: @.***>
--
Nicolas Tromas PhD McGill University E-mail: @. @.> Researchgate: NTromasPage https://www.researchgate.net/profile/Nicolas_Tromas Web: http://www.shapirolab.ca/
--
Nicolas Tromas PhD McGill University E-mail: @. @.> Researchgate: NTromasPage https://www.researchgate.net/profile/Nicolas_Tromas Web: http://www.shapirolab.ca/
Hi Nico,
To load the covT file, you should do something like the following in python:
import inStrain
import inStain.SNVprofile
import inStrain.profile.profile_utilities
# Load your inStrain profile
IS = inStain.SNVprofile.SNVprofile(`/home/mattolm/inStrainOutputTest/`)
# Extract raw covT
covT = IS.get('covT')
# Declare the scaffold you want to analyze and it's length
scaff = 'test_scaffold'
scaff_length = 10000
# Get the coverage information for that scaffold
scaff_cov = inStrain.profile.profile_utilities.mm_counts_to_counts_shrunk(covT[scaff], fill_zeros=scaff_length)
Once you run something like that, scaff_cov
will be a python dictionary where if you input a 0-based position it will give you the coverage (e.g. scaff_cov[10]
) will report the coverage at position 10 at that scaffold.
Best, Matt
Hi Matt,
Great! Thanks for the help!
Cheers,
Nico
Le mer. 1 nov. 2023 12 h 10, Matt Olm @.***> a écrit :
Hi Nico,
To load the covT file, you should do something like the following in python:
import inStrain import inStain.SNVprofile import inStrain.profile.profile_utilities
Load your inStrain profile
IS = inStain.SNVprofile.SNVprofile(
/home/mattolm/inStrainOutputTest/
)Extract raw covT
covT = IS.get('covT')
Declare the scaffold you want to analyze and it's length
scaff = 'test_scaffold' scaff_length = 10000
Get the coverage information for that scaffold
scaff_cov = inStrain.profile.profile_utilities.mm_counts_to_counts_shrunk(covT[scaff], fill_zeros=scaff_length)
Once you run something like that, scaff_cov will be a python dictionary where if you input a 0-based position it will give you the coverage (e.g. scaff_cov[10]) will report the coverage at position 10 at that scaffold.
Best, Matt
— Reply to this email directly, view it on GitHub https://github.com/MrOlm/inStrain/issues/161#issuecomment-1789236131, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABY5D6HJRT6MCHVA7Q3CSQDYCJX6ZAVCNFSM6AAAAAA6EBH6TOVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTOOBZGIZTMMJTGE . You are receiving this because you authored the thread.Message ID: @.***>
Hi!
My question might be naive but I wonder if there is a simple way to include the position that would have ref_freq = 1 to separate these to the one that won't have any coverage. I know it will generate huge output but we would need this "full" information if that's doable with inStrain.
Cheers!
Nico