ikmb / MAGScoT

MAGScoT - a MAG scoring and refinement tool
GNU General Public License v3.0
I would like to obtain the *.fa files processed by MAGScoT. How to output *.fa files? #3

Closed YeGuoZJU closed 1 year ago

mruehlemann commented 1 year ago

Hi @YeGuoZJU,

you can use other existing tools to go from the *.refined.contig_to_bin.out file to fasta files of the individual MAGs. For example using the extract_fasta_bins.py script from CONCOCT:

mkdir cleanbins
cat [sample].refined.contig_to_bin.out | awk '{if(NR==1){print "contig_id,cluster_id"; next}; print $2","$1}' | sed 's/[.]fasta//' | extract_fasta_bins.py [CONTIGS FASTA] /dev/stdin --output_path cleanbins
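
If CONCOCT is not at hand, the same extraction could probably be sketched with plain awk. This is only a rough sketch under a few assumptions: the mapping file has one header line followed by whitespace-separated "bin contig" rows (as the awk command above implies), and the first word of each FASTA header is the contig name. The function name `split_bins` and the `.fa` file naming are made up for illustration and are not part of MAGScoT or CONCOCT.

```shell
# Sketch: write one FASTA file per bin using only awk.
# Usage: split_bins <contig_to_bin file> <contigs FASTA> <output directory>
split_bins() {
    map="$1"      # e.g. example.refined.contig_to_bin.out
    fasta="$2"    # the contigs FASTA that was binned
    outdir="$3"   # created if missing; existing bin files are appended to
    mkdir -p "$outdir"
    awk -v outdir="$outdir" '
        # First file: remember the bin of each contig (line 1 is the header).
        NR == FNR { if (FNR > 1) bin[$2] = $1; next }
        # Second file: on each FASTA header, look up the bin of that contig.
        /^>/ {
            name = substr($1, 2)
            new = (name in bin) ? (outdir "/" bin[name] ".fa") : ""
            # Close the previous bin file so few files are open at once.
            if (out != "" && new != out) close(out)
            out = new
        }
        # Header and sequence lines of binned contigs go to the bin file.
        out != "" { print >> out }
    ' "$map" "$fasta"
}
```

It would be called as `split_bins [sample].refined.contig_to_bin.out [CONTIGS FASTA] cleanbins`. Because the files are opened in append mode, the output directory should be empty before a rerun.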

YeGuoZJU commented 1 year ago

Thanks for your reply. I have a question about the code: what does "sed 's/[.]fasta//'" mean? It seems to do nothing here, because my "[sample].refined.contig_to_bin.out" file does not contain ".fasta". My "[sample].refined.contig_to_bin.out" file looks like this:

cat example.refined.contig_to_bin.out | awk '{if(NR==1){print "contig_id,cluster_id"; next}; print $2","$1}' | head

contig_id,cluster_id
example_contig_13943,example_cleanbin_000001
example_contig_30913,example_cleanbin_000001
example_contig_57083,example_cleanbin_000001
example_contig_29070,example_cleanbin_000001
example_contig_5892,example_cleanbin_000001
example_contig_15458,example_cleanbin_000001
example_contig_66470,example_cleanbin_000001
example_contig_60469,example_cleanbin_000001
example_contig_13834,example_cleanbin_000001

mruehlemann commented 1 year ago

Hi, fair enough, that's something that might apply to my own output, but not to yours. sed can be used for basic text transformations - in this case it should simply delete whatever matches the expression "[.]fasta", i.e. a literal ".fasta" (or, more accurately, replace the first occurrence of it per line with nothing).
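
As a small illustration of that behaviour, here is a sketch with made-up input lines. The helper name `strip_fasta_suffix` is hypothetical and only wraps the sed call from the command above.

```shell
# Wraps the sed call from the pipeline: remove the first ".fasta" on each line.
# "[.]" is a bracket expression, so the dot is matched literally.
strip_fasta_suffix() {
    sed 's/[.]fasta//'
}

# A line containing ".fasta" loses it:
printf 'contig_1,bin_1.fasta\n' | strip_fasta_suffix
# prints: contig_1,bin_1

# A line without ".fasta" passes through unchanged:
printf 'example_contig_13943,example_cleanbin_000001\n' | strip_fasta_suffix
# prints: example_contig_13943,example_cleanbin_000001
```

So for bin names that never contain ".fasta", the sed step is harmless and can simply be left out.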

YeGuoZJU commented 1 year ago

Yes. In fact, when I run your example (the one on GitHub) on my platform, it also deletes the expression "[.]fasta". The difference may be that we used different versions of the binning tool.

You could update the software, haha. All in all, thanks for your reply.