Open bifxcore opened 1 year ago
Hello, good question.
There are two parts to the protocol: the setup (via run_RF2NA.sh) and the model prediction (via predict.py).
The 'PR:' notation is only used as an input to the model prediction step (via predict.py).
To run the full pipeline (via run_RF2NA.sh) with a paired prediction, pass the protein and RNA as two separate inputs using the P:xxx.fa R:xxx.fa notation. The block:
############################################################
# Merge MSAs based on taxonomy ID
############################################################
if [ $nP -eq 1 ] && [ $nD -eq 0 ] && [ $nR -eq 1 ]
then
    echo "Creating joint Protein/RNA MSA"
    echo " -> Running command: $PIPEDIR/input_prep/make_rna_msa.sh $seqfile $WDIR $tag $CPU $MEM"
    $PIPEDIR/input_prep/make_pMSAs_prot_RNA.py $WDIR/$lastP.msa0.a3m $WDIR/$lastR.afa $WDIR/$lastP.$lastR.a3m &> /dev/null
    argstring="PR:$WDIR/$lastP.$lastR.a3m:$WDIR/$lastP.hhr:$WDIR/$lastP.atab"
fi
will make the joint MSA.
Note also that an input fasta file given without the "P" prefix is still treated as a protein.
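To make the rule above concrete, here is a minimal sketch of how typed inputs could be counted to decide whether the joint-MSA branch fires. It is not taken from run_RF2NA.sh: the function name classify_inputs is hypothetical, and the only things assumed from the excerpt are the counters nP, nD and nR, the condition on them, and the default of treating an unprefixed file as protein.

```shell
#!/bin/bash
# Hypothetical helper (not part of run_RF2NA.sh): counts inputs by their
# type prefix and reports whether the paired protein/RNA branch would apply.
classify_inputs() {
    local nP=0 nD=0 nR=0 arg type
    for arg in "$@"; do
        # "R:rna.fa" -> type R; a bare "protein.fa" defaults to type P
        case "$arg" in
            *:*) type="${arg%%:*}" ;;
            *)   type="P" ;;
        esac
        case "$type" in
            P) nP=$((nP + 1)) ;;
            D) nD=$((nD + 1)) ;;
            R) nR=$((nR + 1)) ;;
            *) echo "unknown type: $type" >&2; return 1 ;;
        esac
    done
    # Same condition as the "Merge MSAs based on taxonomy ID" block
    if [ "$nP" -eq 1 ] && [ "$nD" -eq 0 ] && [ "$nR" -eq 1 ]; then
        echo "paired"
    else
        echo "unpaired"
    fi
}
```

With this sketch, classify_inputs protein.fa R:rna.fa and classify_inputs P:protein.fa R:rna.fa both print "paired", while two protein chains plus one RNA print "unpaired" because nP is then 2.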
The README says: Use the tag PR:xxx.fa to specify paired protein/RNA.
The code says "Merge MSAs based on taxonomy ID", which I believe makes sense in my case (binding a tRNA to a synthetase dimer, all from the same source organism). However, I cannot figure out how to specify the PR flag in the input.
run_RF2NA.sh never tests for [ $type = 'PR' ].
What goes in that 'PR:xxx.fa' fasta file? A concatenation of the protein and RNA sequences? What is the sequence separator?