biocorecrg / allele_specific_RNAseq

Allele Specific RNAseq

parseVCF.py usage #11

Closed: Jacq3lin3 closed 4 years ago

Jacq3lin3 commented 4 years ago

Hi Luca,

For reproducibility and ease of use, could you please add some information to the README about how to run parseVCF.py? That way other people in the lab can use the pipeline more easily in the future.

For example: which python version should I use for parseVCF.py?

My default is version 3:

> python2 --version
Python 2.7.5

> python --version
Python 3.4.5

However, I get this silly error when running it with python3 (the default python inside the container):

> singularity exec -e asrnaseq_0.2.sif python ../projects/01_mRNAseq_PGCLC/allele_specific_RNAseq_v5/bin/parseVCF.py -h
  File "../projects/01_mRNAseq_PGCLC/allele_specific_RNAseq_v5/bin/parseVCF.py", line 36
    os.system("gzip "+ opts.wotus) 
                                  ^
TabError: inconsistent use of tabs and spaces in indentation

But when I run it with python2 I have no problems.
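For anyone hitting the same thing: Python 3 rejects indentation that mixes tabs and spaces inconsistently, while Python 2 accepts it by default. Below is a minimal sketch of how one could check a script for this before running it. The helper name find_tab_error is hypothetical and not part of the pipeline; it only relies on the built-in compile() and the TabError exception.

```python
def find_tab_error(source_text, filename="<script>"):
    """Return (line_number, message) if Python 3 reports a TabError, else None.

    Hypothetical helper for illustration. Other syntax errors (for example
    Python 2 only syntax) are not handled here and will propagate.
    """
    try:
        # compile() parses the source without executing it.
        compile(source_text, filename, "exec")
    except TabError as err:
        return (err.lineno, err.msg)
    return None


if __name__ == "__main__":
    # One line indented with a tab, the next with eight spaces.
    mixed = "if True:\n\tx = 1\n        y = 2\n"
    clean = "if True:\n    x = 1\n    y = 2\n"
    print(find_tab_error(mixed))
    print(find_tab_error(clean))
```

The standard library also ships a checker for this that can be run as python -m tabnanny on a file. Converting the script's indentation to spaces only should make it load under both interpreters, assuming the rest of the code is compatible with Python 3.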

Also, a minor thing regarding the usage of parseVCF.py: the run fails if the output VCF file does not already exist.

> singularity exec -e asrnaseq_0.2.sif python2 ../projects/01_mRNAseq_PGCLC/allele_specific_RNAseq_v5/bin/parseVCF.py -1 CAST_EiJ -2 129S1_SvImJ -i mgp.v5.merged.snps_all.dbSNP142.vcf.gz -o CAST_EiJ-129S1_SvImJ.vcf -g GRCm38_68.fa
Error: The requested file (CAST_EiJ-129S1_SvImJ.vcf) could not be opened. Error message: (No such file or directory). Exiting!
gzip: GRCm38_68.fa.masked: No such file or directory
gzip: CAST_EiJ-129S1_SvImJ.vcf: No such file or directory

So I ran touch CAST_EiJ-129S1_SvImJ.vcf and then ran the command again. This time it went fine, and I got both the masked genome and the compressed CAST_EiJ-129S1_SvImJ VCF file.
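The same workaround can be scripted. This is only a sketch of the touch step, assuming the only requirement is that the output file exists before parseVCF.py is called; the helper name ensure_file_exists is hypothetical, and why the script needs the file to exist beforehand is not something I have checked.

```python
from pathlib import Path


def ensure_file_exists(path):
    """Create an empty file at `path` if it is missing (like `touch`).

    Hypothetical helper for illustration. An existing file is left
    untouched apart from its modification time.
    """
    target = Path(path)
    # Create missing parent directories so touch() cannot fail on them.
    target.parent.mkdir(parents=True, exist_ok=True)
    target.touch(exist_ok=True)
    return target


if __name__ == "__main__":
    # File name taken from the command in this issue.
    ensure_file_exists("CAST_EiJ-129S1_SvImJ.vcf")
```

A proper fix would more likely belong inside parseVCF.py itself, so that users do not need to create the file by hand.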

> ls references
CAST_EiJ-129S1_SvImJ.vcf.gz  GRCm38_68.fa.masked.gz  cas                                          mgp.v5.merged.indels.dbSNP142.normed.vcf.gz.tbi
GRCm38_68.fa                 asrnaseq_0.2.sif        mgp.v5.merged.indels.dbSNP142.normed.vcf.gz  mus

Thanks! Jackie

lucacozzuto commented 4 years ago

Fixed