blobtoolkit / viewer

[Archived] BlobToolKit API and viewer code
http://blobtoolkit.genomehubs.org
MIT License

Pie chart - BUSCO version 4 #4

Closed akaraw closed 4 years ago

akaraw commented 4 years ago

Hi,

I am using the blobtools pie chart with BUSCO version 4, but the pie chart that blobtools view produces shows the wrong values. Would you be able to help me with this?

Thank you,

rjchallis commented 4 years ago

Happy to help fix this. Could you give me some more details so I can try to work out what is happening here? This could be a bug in the Viewer or BlobTools2, so it would be helpful to know:

- What is the difference between the expected and displayed values?
- Are the values in the BUSCO table correct, or are both the table and pie chart wrong?
- Did you import directly with BlobTools2 or did you use the Pipeline?
- If you run ./blobtools2 blobtools filter --summary summary.json /path/to/dataset, are the BUSCO values in the summary.json file also wrong?

Could you also let me see the output of git show --oneline -s for the Viewer and BlobTools2 repositories (and the Pipeline if you used it) and would you be able to share the BUSCO short summary and full table outputs so I can try to debug this.

akaraw commented 4 years ago

Hi @rjchallis

Thank you very much for the great tool and swift reply.

I ran the blobtools filter --summary summary.json /path/to/dataset:

    },
    "busco": {
        "aves_odb10": {
            "c": 8024,
            "d": 16,
            "m": -8048,
            "f": 86,
            "t": 62,
            "s": 8008,
            "string": "C:12941.9%[S:12916.1%,D:25.8%],F:138.7%,M:-12980.6%,n:62"
        },
        "vertebrata_odb10": {
            "c": 3228,
            "d": 20,
            "m": -3194,
            "f": 33,
            "t": 67,
            "s": 3208,
            "string": "C:4817.9%[S:4788.1%,D:29.9%],F:49.3%,M:-4767.2%,n:67"
        }

But the actual values are:

BUSCO version is: 4.0.4

The lineage dataset is: vertebrata_odb10 (Creation date: 2019-11-20, number of species: 3354, number of BUSCOs: 67)

Summarized benchmarking in BUSCO notation for file sp.fasta

BUSCO was run in mode: genome

    ***** Results: *****

    C:96.2%[S:95.6%,D:0.6%],F:1.0%,M:2.8%,n:3354
    3228    Complete BUSCOs (C)
    3208    Complete and single-copy BUSCOs (S)
    20      Complete and duplicated BUSCOs (D)
    33      Fragmented BUSCOs (F)
    93      Missing BUSCOs (M)
    3354    Total BUSCO groups searched

BUSCO version is: 4.0.4

The lineage dataset is: aves_odb10 (Creation date: 2019-11-20, number of species: 8338, number of BUSCOs: 62)

Summarized benchmarking in BUSCO notation for file sp.fasta

BUSCO was run in mode: genome

    ***** Results: *****

    C:96.2%[S:96.0%,D:0.2%],F:1.0%,M:2.8%,n:8338
    8024    Complete BUSCOs (C)
    8008    Complete and single-copy BUSCOs (S)
    16      Complete and duplicated BUSCOs (D)
    86      Fragmented BUSCOs (F)
    228     Missing BUSCOs (M)
    8338    Total BUSCO groups searched

I tried both the pipeline and adding the data manually with blobtools add, but got the same error both ways.

And the output of git show --oneline -s:

    insdc-pipeline: e197cb8 add try except around pool.terminate so BLAST errors can be printed (#3)
    blobtools2: 56ff620 improve interrupt catching when hosting

sp.busco.vertebrata_odb10.txt

sp.busco.aves_odb10.txt sp.busco.vertebrata_odb10 (2).txt sp.busco.aves_odb10 (2).txt

Thank you very much!!! AK

rjchallis commented 4 years ago

Thanks for all the info. This looks like a BUSCO reporting bug that is tripping up the BlobTools2 parser. Rather than store information about all the lineage datasets, blobtools add --busco reads the information from the first rows of the full summary table. In this case:

# BUSCO version is: 4.0.4 
# The lineage dataset is: vertebrata_odb10 (Creation date: 2019-11-20, number of species: 3354, number of BUSCOs: 67)

But the number of BUSCOs and number of species values have been swapped so all the percentages are calculated based on the wrong total number of BUSCOs. If you edit the file to swap these two values, the import should work as expected.
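To illustrate why the swap produces these values, here is a minimal sketch that reproduces the vertebrata_odb10 summary string from the counts posted above. The function name `busco_string` and the formula (missing = total - complete - fragmented, percentages relative to the total) are assumptions inferred from the numbers in this thread, not taken from the BlobTools2 source.

```python
def busco_string(c, s, d, f, total):
    """Format BUSCO counts as a summary string.

    Assumption: missing is derived as total - complete - fragmented,
    and every percentage is a count divided by `total`.
    """
    m = total - c - f

    def pct(n):
        # One decimal place, matching the strings shown in this thread.
        return f"{n / total * 100:.1f}%"

    return f"C:{pct(c)}[S:{pct(s)},D:{pct(d)}],F:{pct(f)},M:{pct(m)},n:{total}"


# Counts from the vertebrata_odb10 short summary above.
counts = dict(c=3228, s=3208, d=20, f=33)

# Total read from the swapped header (67): reproduces the wrong string.
print(busco_string(**counts, total=67))

# Total after swapping the two header values (3354): matches BUSCO's own summary.
print(busco_string(**counts, total=3354))
```

With a total of 67 the missing count becomes 67 - 3228 - 33 = -3194, which is the negative "m" value in summary.json.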

I can try to add some code to catch this for people using this BUSCO version, and it looks like I'll need to test other versions to see if the same bug occurs.
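One way such a check could look is sketched below. This is only an illustration, not the actual BlobTools2 fix: `busco_total` and `observed_ids` are hypothetical names, and the sketch assumes the number of distinct BUSCO IDs in the full table rows (including missing ones) is available to compare against the header.

```python
import re

# Matches the two counts in the "lineage dataset" header line of the full table.
HEADER_RE = re.compile(
    r"number of species:\s*(\d+),\s*number of BUSCOs:\s*(\d+)"
)


def busco_total(header_line, observed_ids):
    """Return the total number of BUSCOs, tolerating a swapped header.

    observed_ids is a hypothetical input: the number of distinct BUSCO IDs
    counted in the full table rows.
    """
    match = HEADER_RE.search(header_line)
    if match is None:
        raise ValueError("lineage dataset header not recognised")
    n_species, n_buscos = int(match.group(1)), int(match.group(2))
    if n_buscos == observed_ids:
        # Header agrees with the table, use it as written.
        return n_buscos
    if n_species == observed_ids:
        # The two values look swapped, as in the BUSCO 4.0.4 output above.
        return n_species
    raise ValueError("header counts do not match the number of BUSCO IDs")


header = (
    "# The lineage dataset is: vertebrata_odb10 (Creation date: 2019-11-20, "
    "number of species: 3354, number of BUSCOs: 67)"
)
print(busco_total(header, observed_ids=3354))
```

Cross-checking against the table rows avoids hard-coding a BUSCO version, so the same guard would apply if other versions turn out to have the same swap.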

It would be a good idea for you to report this to the BUSCO team so they can fix it for a future version.

akaraw commented 4 years ago

Awesome. It works!! Thank you very much! Yes, I will report the issue to the BUSCO group.

Thank you again, Kind regards, AK