EnvGen / POGENOM

Population genomics from metagenomes
BSD 2-Clause "Simplified" License

Calculating Gene-wise Aminoacid Diversity (aa-pi) bug #7

Open jianshu93 opened 2 years ago

jianshu93 commented 2 years ago

Dear Author,

When I run the following command:

perl pogenom.pl --vcf_file pico127_pico127.002.vcf --out pico127_pico127.002_pi_fst_gene --gff_file pico127_pico127.002.gff --genome_size 2660050 --genetic_code_file standard_genetic_code.txt

I get the following error:

Use of uninitialized value $mod_contig_seq in substr at pogenom.pl line 924.
substr outside of string at pogenom.pl line 924.

Any idea what is going wrong? The VCF file was generated by Input_POGENOM.sh and the GFF file was generated by Prodigal.

Thanks,

Jianshu

jianshu93 commented 2 years ago

If I want to run multiple samples, should I change the dataset name to "dataset1,dataset2,dataset3"?

How should I do it?

Thanks,

Jianshu

lfdelzam commented 2 years ago

Question 2) Answer: Input_POGENOM only runs one dataset at a time. It is not possible to use "dataset1,dataset2,dataset3" as you suggested.

JFsanchezherrero commented 2 years ago

Hi there,

I think I found a solution for question 1.

The GFF file may contain the FASTA sequence at the end, but pogenom.pl does not properly read either the FASTA sequence or the genome size from the GFF. So even though all the information is included in the GFF, you still need to pass --genome_size and, in this case, the FASTA file as well.

So the solution for the error is to include the FASTA file too:

perl pogenom.pl --vcf_file pico127_pico127.002.vcf --out pico127_pico127.002_pi_fst_gene --gff_file pico127_pico127.002.gff --genome_size 2660050 --genetic_code_file standard_genetic_code.txt --fasta_file XXXX.fna

With the FASTA file included, it runs without errors and the gene-wise amino acid diversity (aa-pi) is calculated.

Best regards,
Jose
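If your GFF does carry the sequence after a `##FASTA` line (the GFF3 convention) and you have no separate .fna file, a small helper along these lines could pull it out and report the total sequence length to pass as --genome_size. This is only a sketch in Python: the function names and file paths are made up for illustration and are not part of POGENOM, and I have not checked that POGENOM accepts every FASTA produced this way.

```python
from pathlib import Path


def split_gff_fasta(gff_text):
    """Split GFF3 text at the ##FASTA directive.

    Returns (annotation_part, fasta_part); fasta_part is "" when the
    GFF has no embedded sequence section.
    """
    lines = gff_text.splitlines()
    for i, line in enumerate(lines):
        if line.strip() == "##FASTA":
            annotation = "\n".join(lines[:i]) + "\n"
            fasta = "\n".join(lines[i + 1:]) + "\n"
            return annotation, fasta
    return gff_text, ""


def genome_size(fasta_text):
    """Sum the lengths of all sequence lines (header lines start with '>')."""
    return sum(
        len(line.strip())
        for line in fasta_text.splitlines()
        if line and not line.startswith(">")
    )


def extract_fasta(gff_path, fasta_out):
    """Write the embedded FASTA (if any) to fasta_out.

    Returns the total sequence length, or None if the GFF has no
    ##FASTA section. Both paths are hypothetical, chosen by the caller.
    """
    _, fasta = split_gff_fasta(Path(gff_path).read_text())
    if not fasta.strip():
        return None
    Path(fasta_out).write_text(fasta)
    return genome_size(fasta)
```

For example, `extract_fasta("my_genome.gff", "my_genome.fna")` would write the sequence part to my_genome.fna and return the number to use with --genome_size, or return None if the GFF holds annotations only, in which case you need the original assembly FASTA.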