Hi @lingolingolin,
Nope, it is done at the read level in the Whitelist module.
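To make the difference concrete, here is a minimal sketch of what read-level downsampling implies. It is not Nanocompore's actual code: the function downsample_reads, the fixed seed, and the toy read ids are all hypothetical, and each toy read is assumed to span every position. The only point is that reads are picked once per reference, so all positions draw on the same set of selected reads.

```python
import random


def downsample_reads(read_ids, max_reads, seed=42):
    """Pick at most max_reads read ids, once per reference (hypothetical helper)."""
    read_ids = list(read_ids)
    if len(read_ids) <= max_reads:
        return read_ids
    rng = random.Random(seed)
    return rng.sample(read_ids, max_reads)


# Hypothetical reference covered by 10 reads, each assumed to span positions 0..4
all_reads = [f"read_{i}" for i in range(10)]
selected = downsample_reads(all_reads, max_reads=4)

# The selection happens once, before any per-position work,
# so every position is analysed with the same subset of reads.
reads_per_position = {pos: sorted(selected) for pos in range(5)}
```

With per-position downsampling, by contrast, the sampling call would sit inside the loop over positions and each site could end up with a different subset.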
Cheers Ad
Thanks @a-slide, do you know how I can trace back which specific reads were selected?
Hi @a-slide, I have another related question:
I always run into the following error unless I switch on the --downsample_high_coverage option:
nanocompore.common.NanocomporeError: The result database is empty
I also tried using Picard's DownsampleSam to subsample the mapped reads and re-ran Nanocompore on the result, but it always gave me the empty database error. I suspect this could be due to uneven or insufficient coverage of the transcript, so I tried downsampling to different numbers of reads with Picard. I checked the collapsed events files and made sure that both samples in the comparison share sites with enough coverage, but I had no luck getting Nanocompore to work.
Do you have any hint as to what the reason could be?
The command I used is below:
nanocompore sampcomp -1 control.downsample.50pcnt.events.collapse.tsv -2 treatment.events.collapse.tsv -f ref.fasta -o nanocompore_samcomp_context4 --pvalue_thr 1 --logit --nthreads 3 --sequence_context 4 --comparison_methods GMM,KS,MW,TT --overwrite
Hi @lingolingolin,
Not directly, unfortunately, but it can be done if you run Nanocompore via the API. In that case you can first create an instance of Whitelist manually and then get the list of selected reads via the ref_reads attribute. Then you can pass the whitelist object to SampComp.
Alternatively, it is also possible to retrieve the read ids from the database generated by SampComp, but it is quite complicated because the database is indexed by reference id and position. In addition, the feature is only implemented in the readid_tracking branch at the moment.
The database is a Python 3 shelve that can be opened as follows to get the list of read ids:
    import shelve

    db_fn = "Path_to_db"
    with shelve.open(db_fn, flag='r') as db:
        # iterate over refid, pos, cond_lab and sample_lab to get all the read_ids
        read_ids = db[refid][pos]["data"][cond_lab][sample_lab]["reads"]
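For reference, the iteration could look roughly like the sketch below. It assumes the nested layout shown above (refid, then pos, then "data", cond_lab, sample_lab, "reads") and skips any entry that does not match it. The helper collect_read_ids and the mock database are hypothetical and only there so the example runs on its own; a real SampComp database from the readid_tracking branch may contain other keys or differ in detail.

```python
import os
import shelve
import tempfile


def collect_read_ids(db_fn):
    """Collect all read ids from a shelve with the nested layout assumed above."""
    read_ids = set()
    with shelve.open(db_fn, flag="r") as db:
        for refid in db:
            ref_data = db[refid]
            if not isinstance(ref_data, dict):
                continue  # skip entries that are not per-reference dicts
            for pos, pos_data in ref_data.items():
                if not isinstance(pos_data, dict) or "data" not in pos_data:
                    continue  # skip anything that is not a position record
                for cond_lab, samples in pos_data["data"].items():
                    for sample_lab, sample_data in samples.items():
                        read_ids.update(sample_data.get("reads", []))
    return read_ids


# Build a small mock database (made-up labels and read ids) to try the helper on
db_fn = os.path.join(tempfile.mkdtemp(), "mock_sampcomp.db")
with shelve.open(db_fn, flag="n") as db:
    db["ref_1"] = {
        10: {"data": {"control": {"rep1": {"reads": ["read_a", "read_b"]}},
                      "treatment": {"rep1": {"reads": ["read_c"]}}}},
        11: {"data": {"control": {"rep1": {"reads": ["read_a"]}}}},
    }
    db["__metadata"] = {"note": "non-reference entry, skipped by the helper"}

all_read_ids = sorted(collect_read_ids(db_fn))
print(all_read_ids)
```

Collecting into a set removes duplicates, since the same read normally shows up at many positions of a reference.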
I will raise a feature request issue
Thanks a lot @a-slide. I will give both approaches a try.
Hi @a-slide @tleonardi ,
Thanks a lot for developing Nanocompore. I have a question about its 'downsample_high_coverage' option. Is downsampling done independently at each reference position/site? If so, events from different reads would be used at different positions in the analysis, right?
Thanks in advance and I look forward to hearing from you.