replikation / What_the_Phage

WtP: Phage identification via nextflow and docker or singularity
https://mult1fractal.github.io/wtp-documentation/
GNU General Public License v3.0

Filter results from tools for further application (samtools, taxonomy, heatmap) #9

Closed mult1fractal closed 4 years ago

mult1fractal commented 5 years ago

Metaphinder

Output from tool:

| contigID | classification | ANI [%] | merged coverage [%] | number of hits | size[bp] |
|---|---|---|---|---|---|
| ctg1 | phage | 75.357 | 99.967 | 71 | 77616 |
| ctg2 | phage | 12.049 | 16.991 | 61 | 45788 |
| ctg3 | phage | 61.742 | 83.422 | 80 | 56341 |
| ctg4 | phage | 75.595 | 99.995 | 55 | 18684 |
| ctg5 | phage | 54.507 | 72.777 | 12 | 11233 |
| ctg6 | phage | 58.68 | 78.916 | 32 | 10420 |
| ctg7 | phage | 71.843 | 95.827 | 47 | 11046 |
| ctg8 | phage | 51.998 | 67.979 | 14 | 8713 |
| ctg9 | phage | 75.518 | 99.99 | 52 | 9716 |
| ctg10 | phage | 74.659 | 98.796 | 47 | 8641 |

    export LC_NUMERIC=en_US.utf-8
    ###### print contig only ########
    mkdir -p sorted_contig_only
    # keep rows whose classification (column 2) is exactly "phage"; this also drops the header line
    sort -g -k4,4 *.txt | awk '$2 == "phage" { print $1 }' > sorted_contig_only/sorted_contig_only.txt

Output of the filter script: sorted_contig_only.txt

ctg7 ctg11 ctg10 ctg14 ctg1 ctg9 ctg4

Writes the contigs classified as 'phage' to a new file.
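
The same filter can be sketched as a small reusable function. This is only an illustration, assuming a tab-separated MetaPhinder result with the header shown above; the function name `filter_metaphinder`, the file `demo_metaphinder.txt` and the `ctgX` row are made up and not part of WtP.

```shell
# Hypothetical helper, not part of WtP: print the IDs of contigs classified
# as "phage", ordered by merged coverage (column 4, ascending).
filter_metaphinder() {
    # FNR > 1 skips the header line of every input file.
    awk -F '\t' 'FNR > 1 && $2 == "phage"' "$@" \
        | LC_ALL=C sort -t "$(printf '\t')" -g -k4,4 \
        | cut -f1
}

# Demo input written to the current directory (tab-separated).
# The first three data rows come from the table above.
printf 'contigID\tclassification\tANI [%%]\tmerged coverage [%%]\tnumber of hits\tsize[bp]\n' > demo_metaphinder.txt
printf 'ctg1\tphage\t75.357\t99.967\t71\t77616\n' >> demo_metaphinder.txt
printf 'ctg2\tphage\t12.049\t16.991\t61\t45788\n' >> demo_metaphinder.txt
printf 'ctg3\tphage\t61.742\t83.422\t80\t56341\n' >> demo_metaphinder.txt
# Made-up row, only to show that non-phage rows are dropped.
printf 'ctgX\tnegative\t1.0\t2.0\t1\t5000\n' >> demo_metaphinder.txt

filter_metaphinder demo_metaphinder.txt
```

Matching the classification with `==` instead of relying on `tail` to drop the header keeps the result independent of where the header ends up after sorting.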

mult1fractal commented 5 years ago

Deepvirfinder output

output from tool:

| name | len | score | pvalue |
|---|---|---|---|
| ctg13 len=5789 | 5789 | 0.9999746084213257 | 0.0008686456681018204 |
| ctg11 len=6697 | 6697 | 0.6740242838859558 | 0.03518014955812373 |
| ctg12 len=4644 | 4644 | 0.9999679923057556 | 0.0010385980814260896 |
| ctg6 len=10420 | 10420 | 0.8116341829299927 | 0.024303195105370497 |
| ctg14 len=21838 | 21838 | 0.9995636940002441 | 0.0027758894176297304 |
| ctg3 len=56341 | 56341 | 0.655927836894989 | 0.03663418687212025 |
| ctg1 len=77616 | 77616 | 0.999999463558197 | 0.0003399048266485384 |
| ctg8 len=8713 | 8713 | 0.9723275899887085 | 0.010423748017221844 |
| ctg10 len=8641 | 8641 | 1.0 | 0.0 |
| ctg9 len=9716 | 9716 | 1.0 | 0.0 |
| ctg2 len=45788 | 45788 | 0.17057377099990845 | 0.15701714631014427 |
| ctg4 len=18684 | 18684 | 0.999731719493866 | 0.002266032177656923 |
| ctg5 len=11233 | 11233 | 0.08206796646118164 | 0.20190346702923181 |
| ctg7 len=11046 | 11046 | 0.9856719970703125 | 0.008422086260291563 |

    export LC_NUMERIC=en_US.utf-8
    ###### print contig only ########
    mkdir -p sorted_contig_only
    # skip the header explicitly, keep rows with score (field 4) >= 0.995, then sort by score
    awk 'FNR > 1 && $4 >= 0.995' *.txt | sort -g -k4,4 | awk '{ print $1 }' > sorted_contig_only/sorted_contig_only.txt

Output of the filter script: sorted_contig_only.txt

ctg14 ctg4 ctg12 ctg13 ctg1 ctg10 ctg9

Sorts the contigs by score in ascending order (most likely hit at the bottom, less likely at the top). Threshold: only contigs with a score of at least 0.995 are written to the new file.
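
To try other cut-offs, the threshold can be made a parameter. The following is a minimal sketch, assuming a whitespace-separated DeepVirFinder result like the one above; the function name `filter_by_score` and the file `demo_dvf.txt` are hypothetical.

```shell
# Hypothetical helper, not part of WtP: print contig IDs whose score is at
# least the given threshold, sorted by score in ascending order.
# Usage: filter_by_score THRESHOLD FILE...
filter_by_score() {
    threshold="$1"
    shift
    # Fields are split on whitespace, so a name like "ctg13 len=5789" takes
    # two fields and the score ends up in field 4.
    awk -v min="$threshold" 'FNR > 1 && $4 + 0 >= min + 0' "$@" \
        | LC_ALL=C sort -g -k4,4 \
        | awk '{ print $1 }'
}

# Demo input written to the current directory; rows taken from the table above.
printf 'name\tlen\tscore\tpvalue\n' > demo_dvf.txt
printf 'ctg11 len=6697\t6697\t0.6740242838859558\t0.03518014955812373\n' >> demo_dvf.txt
printf 'ctg1 len=77616\t77616\t0.999999463558197\t0.0003399048266485384\n' >> demo_dvf.txt
printf 'ctg14 len=21838\t21838\t0.9995636940002441\t0.0027758894176297304\n' >> demo_dvf.txt
printf 'ctg8 len=8713\t8713\t0.9723275899887085\t0.010423748017221844\n' >> demo_dvf.txt

filter_by_score 0.995 demo_dvf.txt
```

With the threshold 0.995 only ctg14 and ctg1 remain from the demo rows; lowering it to 0.9 would also let ctg8 through.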

mult1fractal commented 5 years ago

PPR-Meta output

    export LC_NUMERIC=en_US.utf-8
    ###### print contig only ########
    mkdir -p sorted_contig_only
    tail -n+2 *.csv | grep 'phage' | cut -d ' ' -f1 > sorted_contig_only/sorted_contig_only.txt

Produces a contig file listing the contigs that were marked as phage.
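
A stricter variant would compare only the prediction column instead of grepping the whole line. The sketch below assumes a comma-separated file whose first column is the sequence header and whose last column holds the predicted source; this layout, the function name `filter_pprmeta` and all demo rows are assumptions, not taken from the thread.

```shell
# Hypothetical helper, not part of WtP. Assumes a comma-separated file whose
# first column is the sequence header and whose last column is the predicted
# source (e.g. "phage").
filter_pprmeta() {
    # Keep rows predicted as phage and print the first word of the header.
    awk -F ',' 'FNR > 1 && $NF == "phage" { split($1, name, " "); print name[1] }' "$@"
}

# Made-up demo rows, only to illustrate the assumed layout.
printf 'Header,Length,phage_score,chromosome_score,plasmid_score,Possible_source\n' > demo_pprmeta.csv
printf 'ctgA len=100,100,0.9,0.05,0.05,phage\n' >> demo_pprmeta.csv
printf 'ctgB len=200,200,0.1,0.8,0.1,chromosome\n' >> demo_pprmeta.csv

filter_pprmeta demo_pprmeta.csv
```

Comparing a single column avoids accidental matches when the word "phage" appears elsewhere in a line, for example in a column name.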

mult1fractal commented 5 years ago

Virfinder output

    export LC_NUMERIC=en_US.utf-8
    ###### print contig only ########
    mkdir -p sorted_contig_only
    sort -g -k5,5 *.txt | awk '$5 >= 0.75 { print $2 }' > sorted_contig_only/sorted_contig_only.txt

The top part of the Nextflow output from Virfinder still needs to be removed; after that, the script yields a clean contig-only file.

mult1fractal commented 5 years ago

Marvel output

Output filter script: OX2_draft

    export LC_NUMERIC=en_US.utf-8
    ###### print contig only ########
    mkdir -p sorted_contig_only
    grep '>' *.txt | awk '$4 >= 75.0 { print $2 }' > sorted_contig_only/sorted_contig_only.txt

mult1fractal commented 5 years ago

Virsorter output

output from tool:

category 2

    mkdir -p sorted_contig_only_cat_2
    cat *2.fasta | grep '>' | cut -d '_' -f2 > sorted_contig_only_cat_2/sorted_contig_only.txt

category 3

    mkdir -p sorted_contig_only_cat_3
    cat *3.fasta | grep '>' | cut -d '_' -f2 > sorted_contig_only_cat_3/sorted_contig_only.txt


* output:
ctg1

#### Problem:
* The sequence names in the CRC test file are structured differently, so a slightly modified cut command is needed.
* Header structure in the CRC test file: `>VIRSorter_k99_2780941_flag=1_multi=1_0000_len=4161-cat_3`
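
Why the field selection has to change can be checked directly on that header. This is a small illustration, assuming the ID of interest is the `k99_2780941` part; the variable name `header` is made up.

```shell
# Header taken from the CRC test file example above.
header='>VIRSorter_k99_2780941_flag=1_multi=1_0000_len=4161-cat_3'

# Field 2 alone cuts the contig ID in half at its internal underscore.
printf '%s\n' "$header" | cut -d '_' -f2      # k99

# Fields 2 and 3 together give the full ID.
printf '%s\n' "$header" | cut -d '_' -f2,3    # k99_2780941
```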

```Shell-Script
# for CRC

cd Predicted_viral_sequences
#category 1
mkdir sorted_contig_only_cat_1
cat *1.fasta | grep '>' | cut -d '_' -f2,3 > sorted_contig_only_cat_1/sorted_contig_only.txt

#category 2
mkdir sorted_contig_only_cat_2
cat *2.fasta | grep '>' | cut -d '_' -f2,3 > sorted_contig_only_cat_2/sorted_contig_only.txt

#category 3
mkdir sorted_contig_only_cat_3
cat *3.fasta | grep '>' | cut -d '_' -f2,3 > sorted_contig_only_cat_3/sorted_contig_only.txt
```

replikation commented 4 years ago

@Stormrider935 status? Can this be closed? Please reference the commits.