makovalab-psu / DiscoverY

K-mer based classifier for Y-contig identification from Whole Genome Assemblies
MIT License

depth of coverage #15

Open DiegoSafian opened 1 year ago

DiegoSafian commented 1 year ago

Hi,

Where in the output is the depth of coverage written?

Thanks

shivanshss commented 1 year ago

This is from the README.

The fasta file will be annotated like the following in the female_only mode:

record_id length_of_contig proportion_shared_with_female

And in female+male mode:

record_id length_of_contig proportion_shared_with_female median_k-mer_abundance

"median_k-mer_abundance" appears to be the coverage value you are interested in. You can collect the headers from the annotated FASTA file with this command:

awk '/^>/ {print substr($0,2)}' "$fasta_file" > fasta_headers.tab

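You can then open fasta_headers.tab in Excel, select both space and tab as delimiters so each field lands in its own cell, and read the median k-mer abundance from column 4. Note that column 4 is only present in female+male mode.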
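
If you would rather skip the spreadsheet step, the same column can be pulled out with a few lines of Python. This is only a rough sketch, not part of DiscoverY: the function name `read_median_abundance` is made up for illustration, and it assumes the header fields are whitespace-separated in the order the README lists, with a numeric fourth field.

```python
def read_median_abundance(fasta_path):
    """Return (record_id, median_k-mer_abundance) pairs from an annotated FASTA.

    Assumes headers look like:
    >record_id length_of_contig proportion_shared_with_female median_k-mer_abundance
    """
    rows = []
    with open(fasta_path) as handle:
        for line in handle:
            if not line.startswith(">"):
                continue  # sequence line, not a header
            fields = line[1:].split()
            if len(fields) < 4:
                continue  # female_only headers carry no abundance column
            rows.append((fields[0], float(fields[3])))
    return rows


if __name__ == "__main__":
    import sys

    # Usage: python this_script.py annotated.fasta
    for record_id, abundance in read_median_abundance(sys.argv[1]):
        print(f"{record_id}\t{abundance}")
```

Headers with fewer than four fields are skipped, so running this on female_only output simply prints nothing.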