Hi,
I ran GRAViTy_Pipeline_I (the example) with the following options:
GRAViTy_Pipeline_I \
--GenomeDescTableFile "./Test/Data/Ref/VMR_Test_Ref.txt" \
--ShelveDir "./Test/Analysis/Ref/VI" \
--Database "VI" \
--Database_Header "Baltimore Group" \
--TaxoGrouping_Header "Taxonomic grouping" \
--N_Bootstrap 10 --GenomeSeqFile "./Test/Data/Ref/GenomeSeqs.VI.gb"
The program terminated with an error at the step "- Make protein alignments".
Here is the complete output:
$ GRAViTy_Pipeline_I \
> --GenomeDescTableFile "./Test/Data/Ref/VMR_Test_Ref.txt" \
> --ShelveDir "./Test/Analysis/Ref/VI" \
> --Database "VI" \
> --Database_Header "Baltimore Group" \
> --TaxoGrouping_Header "Taxonomic grouping" \
> --N_Bootstrap 10 --GenomeSeqFile "./Test/Data/Ref/GenomeSeqs.VI.gb"
Input for ReadGenomeDescTable:
====================================================================================================
Main input
--------------------------------------------------
GenomeDescTableFile: ./Test/Data/Ref/VMR_Test_Ref.txt
ShelveDir: ./Test/Analysis/Ref/VI
Database: VI
Database_Header: Baltimore Group
TaxoGrouping_Header: Taxonomic grouping
TaxoGroupingFile: None
====================================================================================================
################################################################################
#Read the GenomeDesc table #
################################################################################
- Define dir/file paths
to program output shelve
- Read the GenomeDesc table
- Save variables to ReadGenomeDescTable.AllGenomes.shelve
BaltimoreList
OrderList
FamilyList
SubFamList
GenusList
VirusNameList
SeqIDLists
SeqStatusList
TaxoGroupingList
TranslTableList
DatabaseList
- Save variables to ReadGenomeDescTable.CompleteGenomes.shelve
BaltimoreList
OrderList
FamilyList
SubFamList
GenusList
VirusNameList
SeqIDLists
SeqStatusList
TaxoGroupingList
TranslTableList
DatabaseList
Input for PPHMMDBConstruction:
====================================================================================================
Main input
--------------------------------------------------
GenomeSeqFile: ./Test/Data/Ref/GenomeSeqs.VI.gb
ShelveDir: ./Test/Analysis/Ref/VI
Protein extraction options
--------------------------------------------------
ProteinLength_Cutoff: 100
IncludeProteinsFromIncompleteGenomes: True
Protein clustering options
--------------------------------------------------
BLASTp_evalue_Cutoff: 0.001
BLASTp_PercentageIden_Cutoff: 50
BLASTp_QueryCoverage_Cutoff: 75
BLASTp_SubjectCoverage_Cutoff: 75
BLASTp_num_alignments: 1000000
BLASTp_N_CPUs: 88
MUSCLE_GapOpenCost: -3.0
MUSCLE_GapExtendCost: -0.0
ProtClustering_MCLInflation: 2
Protein alignment merging options
--------------------------------------------------
N_AlignmentMerging: 0
HHsuite_evalue_Cutoff: 1e-06
HHsuite_pvalue_Cutoff: 0.05
HHsuite_N_CPUs: 88
HHsuite_QueryCoverage_Cutoff: 85
HHsuite_SubjectCoverage_Cutoff: 85
PPHMMClustering_MCLInflation_ForAlnMerging: 5
HMMER_PPHMMDB_ForEachRoundOfPPHMMMerging: True
====================================================================================================
################################################################################
#Build a database of virus protein profile hidden Markov models (PPHMMs) #
################################################################################
- Define dir/file paths
to BLASTp shelve directory
to BLASTp query file
to BLASTp subject file
to BLASTp output file
to BLASTp bit score matrix file
to protein cluster file
to protein cluster directory
to HMMER shelve directory
to HMMER PPHMM directory
to HMMER PPHMM database directory
to HMMER PPHMM database
to program output shelve
- Retrieve variables
from ReadGenomeDescTable.AllGenomes.shelve
BaltimoreList
OrderList
FamilyList
SubFamList
GenusList
VirusNameList
TaxoGroupingList
SeqIDLists
TranslTableList
- Download GenBank file
GenomeSeqFile doesn't exist. GRAViTy is downloading the GenBank file(s)
Here are the accession numbers to be downloaded:
M14008
AF033809
AF033808
M80216
AF033807
AF074966
AF033822
DQ237904
X03711
AF151794
AF356697
M32690
M25381
U21603
L06906
Y08851
JQ867463
M74895
MF280817
X54482
GU356395
M37980
Y00302
M10455
AY282754
M23385
M10060
AF014792
AF052723
M26927
J02207
AF033813
AY842951
M33677
AF033819
U03982
U94514
KM233624
LC094267
JQ867466
EU010385
U04327
KP143760
To download GenBank file(s), please provide your email: 610262417@qq.com
- Read GenBank file
- Extract/predict protein sequences from virus genomes, excluding proteins with lengthes <100 aa
- ALL-VERSUS-ALL BLASTp
Make BLASTp database
Performe ALL-VERSUS-ALL BLASTp analysis
Save protein-protein similarity scores (BLASTp bit scores)
- Cluster protein sequences based on BLASTp bit scores, using the MCL algorithm
- Make protein alignments
Traceback (most recent call last):
File "/ifs1/User/yuzh/miniconda3/envs/grav/bin/GRAViTy_Pipeline_I", line 956, in <module>
main()
File "/ifs1/User/yuzh/miniconda3/envs/grav/bin/GRAViTy_Pipeline_I", line 798, in main
HMMER_PPHMMDB_ForEachRoundOfPPHMMMerging = str2bool(options.HMMER_PPHMMDB_ForEachRoundOfPPHMMMerging),
File "/ifs1/User/yuzh/miniconda3/envs/grav/lib/python2.7/site-packages/GRAViTy/PPHMMDBConstruction.py", line 649, in PPHMMDBConstruction
"AlignmentLength":AlignIO.read(AlnClusterFile, "fasta").get_alignment_length()
File "/ifs1/User/yuzh/miniconda3/envs/grav/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 429, in read
first = next(iterator)
File "/ifs1/User/yuzh/miniconda3/envs/grav/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 376, in parse
for a in i:
File "/ifs1/User/yuzh/miniconda3/envs/grav/lib/python2.7/site-packages/Bio/AlignIO/__init__.py", line 279, in _SeqIO_to_alignment_iterator
yield MultipleSeqAlignment(records, alphabet)
File "/ifs1/User/yuzh/miniconda3/envs/grav/lib/python2.7/site-packages/Bio/Align/__init__.py", line 169, in __init__
self.extend(records)
File "/ifs1/User/yuzh/miniconda3/envs/grav/lib/python2.7/site-packages/Bio/Align/__init__.py", line 487, in extend
self._append(rec, expected_length)
File "/ifs1/User/yuzh/miniconda3/envs/grav/lib/python2.7/site-packages/Bio/Align/__init__.py", line 550, in _append
raise ValueError("Sequences must all be the same length")
ValueError: Sequences must all be the same length
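
In case it helps with diagnosis: the traceback shows `AlignIO.read(AlnClusterFile, "fasta")` rejecting a cluster file because its sequences differ in length, which suggests that at least one cluster file was left unaligned or only partly written by the alignment step. Below is a minimal sketch of a check that could be run over the cluster files to find which ones are affected. It is only an assumption on my part that these files are plain FASTA; the function names and the command-line usage are my own and are not part of GRAViTy. It uses only the standard library, so it should run under the same Python 2.7 environment.

```python
import sys


def read_fasta_lengths(path):
    """Return a list of (record_id, sequence_length) pairs from a FASTA file."""
    lengths = []
    name, size = None, 0
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    lengths.append((name, size))
                name, size = line[1:], 0
            elif name is not None:
                # Sequence lines may be wrapped, so add up every line.
                size += len(line)
    if name is not None:
        lengths.append((name, size))
    return lengths


def distinct_lengths(path):
    """Return the sorted distinct sequence lengths found in a FASTA file."""
    return sorted(set(length for _, length in read_fasta_lengths(path)))


def is_valid_alignment(path):
    """True if the file has at least one record and all records are equally long."""
    return len(distinct_lengths(path)) == 1


if __name__ == "__main__":
    # Hypothetical usage: python check_aln.py <cluster file> [<cluster file> ...]
    for path in sys.argv[1:]:
        if not is_valid_alignment(path):
            print("%s: sequence lengths %s" % (path, distinct_lengths(path)))
```

Any file this prints would be one that `AlignIO.read` cannot load as an alignment. A file reported with an empty list of lengths contains no records at all.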