CompEpigen / figeno

Tool for plotting sequencing data along genomic coordinates.
GNU General Public License v3.0
240 stars 8 forks

MAG support #12

Closed. Ge0rges closed this issue 2 months ago

Ge0rges commented 2 months ago

Hello,

Neat tool. I was wondering if you would consider making some changes to allow for easier visualization of MAGs. For example, it would be nice if the reference genome could simply be a FASTA file, with gene information provided by, for instance, uploading the output of prodigal.

Then the "add all chromosomes" button should read that FASTA file to get all the contig names.

e-sollier commented 2 months ago

Hi,

Currently the "add all chromosomes" option completely ignores the reference and just adds the human chromosomes. I can of course update it to take the reference into account, but then the question is which reference file would be appropriate. For now, figeno only supports RefSeq or gtf files for gene annotations. I am not familiar with prodigal, but apparently it can output gene predictions in gff3 format. I can add support for gff3, if that would be useful for you.

Then, to get the list of contig names, I guess I could simply extract them from the gff3 file? This single file would provide both the genes and the list of contigs. A fasta file (or even just its index, .fai) could provide the list of contigs as well, but if this information is already provided in the gff3 file, this might not be necessary.

Let me know what you think. Would simply adding support for gff3, and making the "add all chromosomes" option take it into account, be sufficient for your purposes?
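
To make the gff3 idea concrete, here is a minimal sketch of pulling contig names out of a gff3 file. It is not figeno's code: the function name contigs_from_gff3 and the example records are hypothetical. It only assumes the standard gff3 layout, where the first tab-separated column of each feature line is the sequence ID and an optional "##sequence-region" directive names a sequence.

```python
import io


def contigs_from_gff3(handle):
    """Return contig names in order of first appearance in a gff3 stream.

    Hypothetical helper for illustration; not part of figeno.
    """
    seen = {}  # dicts keep insertion order, so this doubles as an ordered set
    for line in handle:
        line = line.rstrip("\r\n")
        if line.startswith("##FASTA"):
            break  # an embedded sequence section follows; no more features
        if line.startswith("##sequence-region"):
            parts = line.split()
            if len(parts) >= 2:
                seen.setdefault(parts[1], None)
            continue
        if not line or line.startswith("#"):
            continue  # skip blank lines, comments, and other directives
        seqid = line.split("\t", 1)[0]  # column 1 of a feature line
        seen.setdefault(seqid, None)
    return list(seen)


# Made-up records for three contigs, of which contig_2 has no predicted gene.
EXAMPLE_GFF3 = (
    "##gff-version 3\n"
    "contig_1\tProdigal\tCDS\t3\t98\t1.5\t+\t0\tID=1_1\n"
    "contig_1\tProdigal\tCDS\t120\t300\t4.2\t-\t0\tID=1_2\n"
    "contig_3\tProdigal\tCDS\t10\t250\t7.9\t+\t0\tID=3_1\n"
)

print(contigs_from_gff3(io.StringIO(EXAMPLE_GFF3)))
```

Note that contig_2 never shows up in this example: a contig with no feature line (and no "##sequence-region" directive) cannot be recovered from the gff3 file alone, which is one argument for also accepting a .fai index.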

Ge0rges commented 2 months ago

Hello,

I think it would be more versatile to get the contigs from a FASTA index, the genome from a FASTA file, and the genes from GFF3 or something like a BED file.

Thanks!

e-sollier commented 2 months ago

OK, I'll try to add support for the fasta index and gff3 format for genes. But I don't see the need for a fasta file? Figeno does not use the sequence information from a fasta.

Ge0rges commented 2 months ago

Ah, got it. My mistake, I thought you might use the sequence.

e-sollier commented 2 months ago

I've added support for gff3 format (for gene annotations) and .fai files (to get the list of contigs) in the latest update 1.3.1, which you can install with pip install figeno==1.3.1. The .fai file can be used instead of a cytoband file, when using a custom reference. If you provide such a .fai file, then the "add all chromosomes" button will add all the contigs present in the .fai file.
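
For readers unfamiliar with the .fai format, the sketch below shows how a contig list can be read from one. It is an illustration only, not figeno's implementation, and the name read_fai and the example contigs are hypothetical. It relies on the standard FASTA index layout (as written by, for example, samtools faidx): one tab-separated line per sequence, with the name in the first column and the length in bases in the second.

```python
import io


def read_fai(handle):
    """Return (contig_name, length) pairs from a FASTA index (.fai) stream.

    Hypothetical helper for illustration; not part of figeno.
    """
    contigs = []
    for line in handle:
        fields = line.rstrip("\r\n").split("\t")
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        # Columns: name, length, byte offset, bases per line, bytes per line.
        contigs.append((fields[0], int(fields[1])))
    return contigs


# Made-up index for two contigs.
EXAMPLE_FAI = (
    "contig_1\t5000\t10\t60\t61\n"
    "contig_2\t1200\t5104\t60\t61\n"
)

for name, length in read_fai(io.StringIO(EXAMPLE_FAI)):
    print(name, length)
```

Unlike a gene annotation file, the index lists every sequence in the FASTA file along with its length, whether or not any gene was predicted on it.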

Let me know if this works, and if you have other suggestions!

Ge0rges commented 2 months ago

Thanks for adding support so quickly. I'll try it out as soon as that update is distributed.