veg / hyphy

HyPhy: Hypothesis testing using Phylogenies
http://www.hyphy.org

Issue in GARD generated nex file #1675

Closed Daniyalprime closed 3 months ago

Daniyalprime commented 6 months ago

I tried to run MEME with the nex file generated by GARD (Datamonkey web server), but I got an error message saying "An unexpected error occured when parsing the sequence alignment!". I don't know what is causing this problem. I have attached the nex file for your reference. Any help would be appreciated.

screened_data.nex.txt

spond commented 6 months ago

Dear @Daniyalprime,

Are you receiving this error through Datamonkey?

Best, Sergei

Daniyalprime commented 6 months ago

Yes, I am trying to use MEME on the Datamonkey web server. @spond

spond commented 6 months ago

Dear @Daniyalprime,

Unfortunately, Datamonkey doesn't report the specifics of data upload errors. In your case, there were 3' stop codons in the alignment:

hyphy meme --alignment /Users/sergei/Downloads/screened_data.nex.txt 
...

>code => Universal
*** PROBLEM WITH SEQUENCE ' MK448824_1_CDS_0063_ENDOLYSIN' (1530 nt long, stop codons shown in capital letters)

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ttgaaagctaatgaatattttattgatgtgtcagcttatcaaccagctgatttaggctatacaacacaagcttcgggaacatcaaacacaattatcaaagtaacagagggaactggctgggtatcaccatataaagatgcccaagctagcacttcaaatcctattgggtattatcacttcgcacgttttggcggtaatgtaaatcaagcaatcgcagaagctacacattgccttaacaatgttggaaataaaaaagttaactacattgtatgtgattatgaagatagtgccagtgctaacatgcaagctaatacaaatgctattatagcttttatggatacgtgtaaatcttatggctatcaacctatttactattcttataagccttacacactagctaacgtagcttatcaacaaatactaactaaatatccagatagtttatggatggctgcatatccaagctataatgtcaccccagatcctgtttggtcaatttttccagctcttgacgggataagatggtggcaattcacatcaactggcattagtggtggacttgataaaaatgtagttttaatt------gatgataatgtgcatacagcaccaaaaaca------acagtaaaaattgaa---------gattatacggaggat------------------------------------------------------------------------------------------gaagatatgtttgtatttacagcaaatagtgatttaggttcagtc------aataacgactttaaaaaatttggaaaaggagcgacattcctaatcatt------------cca------------tcacgagcaaaa---attcgttatgttggtacatcagatgaacgcaactggttattaaaaactaagaaaatagaagaatttaaaggtggaccagcggatatgtttagatgcttaaca---------------------------------------ctttttgat------------------------ttaaaatttTAA

I went ahead and just masked them with --- (attached). MEME results are here: https://www.datamonkey.org/meme/65956336ba6f2072cc42574f
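For readers who want to apply the same masking to other alignments, here is a minimal Python sketch of the idea. It is not HyPhy code: the function name `mask_stop_codons` is hypothetical, and it assumes the standard ("Universal") genetic code and sequences that are in frame from the first position.

```python
# Stop codons of the standard ("Universal") genetic code.
# Assumption: the alignment uses this code; other codes have different stops.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def mask_stop_codons(seq, mask="---"):
    """Return seq with every in-frame stop codon replaced by mask.

    Assumes the sequence is in frame starting at position 0.
    A trailing partial codon (1 or 2 characters) is left untouched.
    """
    # Split the sequence into consecutive, non-overlapping triplets.
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # Compare case-insensitively and treat RNA 'U' as 'T'.
    return "".join(
        mask if c.upper().replace("U", "T") in STOP_CODONS else c
        for c in codons
    )


if __name__ == "__main__":
    # Toy sequence ending in a 3' stop codon, like the one reported above.
    print(mask_stop_codons("atg---ttaaaatttTAA"))
```

Only in-frame triplets are checked, so a `TAA` that straddles two codons is left alone. Applying this to a whole NEXUS file would still require parsing the sequences out of the file first, which this sketch does not attempt.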

Best, Sergei

screened_data.nex.txt

Daniyalprime commented 6 months ago

Thank you very much, @spond. Unfortunately, when I tried to run MEME on the Datamonkey server with the updated file you provided, I got this error: " **Error:

Master node received an error:<'constant' operation 'X'>, where 'X' is not a number. constant = 0 'X' = null While computing: meme.site_beta_nuisance*null

Function call stack 1 : Tree meme.site_tree_fel = ((MK448953_1_CDS_0061_ENDOLYSIN,((((((((((((MK448824_1_CDS_0063_ENDOLYSIN,MK448712_1_CDS_0028_ENDOLYSIN)NODE13,DQ386162_1_CDS_0019_ENDOLYSIN)NODE12,MT430910_1_CDS_0024_ENDOLYSIN)NODE11,KR153145_1_CDS_0019_ENDOLYSIN)NODE10,MH892358_1_CDS_0022_ENDOLYSIN)NODE9,MH937482_1_CDS_0021_ENDOLYSIN)NODE8,NC_070693_1_CDS_0022_ENDOLYSIN)NODE7,NC_070664_1_CDS_0022_ENDOLYSIN)NODE6,NC_071075_1_CDS_0024_ENDOLYSIN)NODE5,MK448933_1_CDS_0072_ENDOLYSIN)NODE4,((MK448837_1_CDS_0053_ENDOLYSIN,((MH853355_1_CDS_0038_ENDOLYSIN,MH853358_1_CDS_0038_ENDOLYSIN,MK448956_1_CDS_0062_ENDOLYSIN,MH853356_2_CDS_0024_ENDOLYSIN)NODE29,OQ633473_1_CDS_0060_ENDOLYSIN)NODE28)NODE26,JX409895_1_CDS_0041_ENDOLYSIN)NODE25)NODE3,(((JX409894_1_CDS_0017_ENDOLYSIN,MK448796_1_CDS_0080_ENDOLYSIN)NODE40,(MK448669_1_CDS_0054_ENDOLYSIN,MK448971_1_CDS_0065_ENDOLYSIN)NODE43)NODE39,MK448851_1_CDS_0065_ENDOLYSIN)NODE38)NODE2,MK448775_1_CDS_0061_ENDOLYSIN,MK448766_1_CDS_0050_ENDOLYSIN,MK448696_1_CDS_0052_ENDOLYSIN,MK448952_1_CDS_0050_ENDOLYSIN,MK448961_1_CDS_0048_ENDOLYSIN));

Keyword arguments:
    {
     "tree":"/home/datamonkey/datamonkey-js-server/production/app/meme/output/65956bc0ba6f2072cc4257e8.tre"
    }

2 : ExecuteCommands("UseModel ("+model.ApplyModelToTree.modelID[terms.id]+"); Tree id = "+tree["string"]+"; ", /home/datamonkey/datamonkey-js-server/production/app/meme/../../.hyphy/res/TemplateBatchFiles/libv3/models/);

Keyword arguments:
    {
     "tree":"/home/datamonkey/datamonkey-js-server/production/app/meme/output/65956bc0ba6f2072cc4257e8.tre"
    }

3 : model.ApplyModelToTree("meme.site_tree_fel",meme.trees[meme.partition_index],{terms.default:meme.site.background_fel},None);

Keyword arguments:
    {
     "tree":"/home/datamonkey/datamonkey-js-server/production/app/meme/output/65956bc0ba6f2072cc4257e8.tre"
    }

-------** "

spond commented 6 months ago

Dear @Daniyalprime,

Indeed! I noticed that just now. Let me look into the issue a bit more and report what I find.

Best, Sergei

Daniyalprime commented 6 months ago

Dear @spond, The issue still persists. Can you please provide guidance on how to overcome this issue? I appreciate your assistance and cooperation in this matter.

spond commented 6 months ago

Dear @Daniyalprime,

The issue you were encountering was due to an internal HyPhy / MEME bug. It'll be fixed in the next release. Thank you for reporting it and helping improve our tools.

In the meantime, please view the MEME results for your data here:

https://observablehq.com/@spond/meme?url=https://dl.dropbox.com/s/b34ha9st4yx8g5w/screened_data.nex.txt.MEME.json.gz

Best, Sergei

Daniyalprime commented 6 months ago

Dear @spond, Thank you for your response. I need to run MEME analysis on some other nex files that were created by GARD. Is there a workaround that you can suggest for this task?

spond commented 6 months ago

Dear @Daniyalprime,

The error will only occur in some multi-partition datasets. Please try running your other data on Datamonkey and see what happens. Please let me know if any other datasets produce errors.

Best, Sergei

Daniyalprime commented 6 months ago

Dear @spond, I ran MEME on the remaining 11 nex files produced by GARD and encountered the same problem in every case. I also tried the sample file (pol.nex) and got the same error.

spond commented 6 months ago

Dear @Daniyalprime,

I'll prioritize pushing out a fix over the weekend. I'll let you know when this is done.

Best, Sergei
