secastel / phaser

phasing and Allele Specific Expression from RNA-seq
GNU General Public License v3.0

phaser annotate module is still py2.7? #81

Closed FunongLuo closed 3 months ago

FunongLuo commented 4 months ago

Hi @secastel, the README says of the annotate module: "Runs on Python 2.7.x, has the following dependencies: pysam, pyVCF. NOTE: pyVCF is not compatible with Python 3+ so this script cannot be updated." However, I was just able to install pyvcf under Python 3.9. I am not sure whether I still need to build a separate Python 2.7 environment, because I noticed that every module other than phaser-annotate is based on Python 3.

secastel commented 3 months ago

This is correct: other than phASER annotate, all phASER components are based on Python 3. In my testing I could not get pyVCF to install with pip in a Python 3 environment. If you plan to use the phASER annotate functionality, you will unfortunately need to create a Python 2.7 environment. I apologize for the inconvenience.

FunongLuo commented 3 months ago

Thanks for your reply. Another question: the allelic effect reported for each gene has a direction, namely reads from haplotype A divided by reads from haplotype B, so when log2_aFC is negative, haplotype A has fewer reads than haplotype B. I currently have phased VCF files for the genome (10X depth, phased with Beagle), which can serve as a truth set. Suppose the phased SNPs of two genes look like this:

    GENE A  1  1121066  0|1
    GENE A  1  1121145  0|1
    GENE A  1  1121210  1|0
    GENE A  1  1121210  0|1
    GENE B  1  2221278  1|1
    GENE B  1  2221401  0|1
    GENE B  1  2221456  0|1
    GENE B  1  2221510  1|0

Can it be said that the read counts for haplotype A come from the alleles on the left side of "|" (0,0,1,0 for gene A), and the read counts for haplotype B come from the alleles on the right side of "|" (1,1,0,1 for gene A)?

If the phaser_case_gene_ae.txt result is:

    name    aCount  bCount  totalCount  log2_aFC
    GENE A  19      30      49          -0.6589630821649332
    GENE B  93      46      139         1.0155968550510186

can I say that gene A's aCount comes from haplotype A (0,0,1,0) and its bCount from haplotype B (1,1,0,1), and that gene B's aCount comes from haplotype A (1,0,0,1) and its bCount from haplotype B (1,1,1,0)?

Are the directions of haplotype A and haplotype B consistent across different genes on the same chromosome? That is, do the A counts always come from the left side of "|" and the B counts always from the right side?

secastel commented 2 months ago

Yes, that is the correct interpretation!
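
For readers who want to check this interpretation against their own files, here is a minimal Python sketch of the mapping discussed above. It is not phASER code: the helper names `split_haplotypes` and `log2_afc` are hypothetical, the genotypes and counts are copied from the question, and the formula log2(aCount / bCount) is an assumption inferred from the posted numbers rather than taken from the phASER source.

```python
import math

# Phased genotypes copied from the question: (gene, chrom, pos, GT).
PHASED_GENOTYPES = [
    ("GENE A", "1", 1121066, "0|1"),
    ("GENE A", "1", 1121145, "0|1"),
    ("GENE A", "1", 1121210, "1|0"),
    ("GENE A", "1", 1121210, "0|1"),
    ("GENE B", "1", 2221278, "1|1"),
    ("GENE B", "1", 2221401, "0|1"),
    ("GENE B", "1", 2221456, "0|1"),
    ("GENE B", "1", 2221510, "1|0"),
]


def split_haplotypes(records):
    """Group alleles per gene: left of '|' -> haplotype A, right -> haplotype B."""
    haps = {}
    for gene, _chrom, _pos, gt in records:
        left, right = gt.split("|")
        hap_a, hap_b = haps.setdefault(gene, ([], []))
        hap_a.append(int(left))
        hap_b.append(int(right))
    return haps


def log2_afc(a_count, b_count):
    """Assumed definition: log2 of haplotype A reads over haplotype B reads."""
    return math.log2(a_count / b_count)


if __name__ == "__main__":
    for gene, (hap_a, hap_b) in split_haplotypes(PHASED_GENOTYPES).items():
        print(gene, "hapA:", hap_a, "hapB:", hap_b)
    # Counts copied from the phaser_case_gene_ae.txt excerpt in the question.
    print("GENE A log2_aFC:", log2_afc(19, 30))
    print("GENE B log2_aFC:", log2_afc(93, 46))
```

Under these assumptions the sketch gives haplotype A as the left-hand alleles for both genes, and the two log2 ratios agree with the log2_aFC values quoted in the question to within floating-point tolerance.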