dondi / GRNsight

Web app and service for modeling and visualizing gene regulatory networks.
http://dondi.github.io/GRNsight
BSD 3-Clause "New" or "Revised" License
17 stars, 8 forks

Buggy file crashes local server in Branch Onariaginosa #916

Closed Onariaginosa closed 3 years ago

Onariaginosa commented 3 years ago

When I upload a specific workbook to my local server on my branch, GRNsight crashes with an error. I recreated the workbook and uploaded it again, and the recreated workbook works.

Here is the buggy file: optimization-parameters-default-tester.xlsx
Here is the duplicate file: optimization-parameters-default.xlsx

Here is the error message in Node v14.15.5: Screenshot from 2021-02-24 14-17-23

Here is the error message in Node v8.4.0: Screenshot from 2021-02-24 14-26-02

dondi commented 3 years ago

The issue looks like it comes from a malformed workbook object—malformed in such a way that it cannot be converted to JSON, terminating the server. It would probably be good to get a more specific idea of what's wrong here by looking at the workbook object via console.log, before the code attempts to JSON-ify it.
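As a rough illustration of the inspection step suggested above, the sketch below logs an object with Node's `util.inspect` (which tolerates values that `JSON.stringify` rejects) and then reports whether serialization would succeed. The helper name `inspectBeforeSerializing` and the sample objects are made up for this example; a circular reference is used only as one known way an object can fail to convert to JSON, not as a diagnosis of this workbook.

```javascript
const util = require("util");

// Hypothetical helper (not part of GRNsight): logs a workbook-like object
// without risking a crash, then reports whether JSON.stringify would succeed.
function inspectBeforeSerializing(workbook, log = console.log) {
    // util.inspect copes with circular references, which JSON.stringify rejects.
    log(util.inspect(workbook, { depth: 4, maxArrayLength: 50 }));

    try {
        JSON.stringify(workbook);
        return { serializable: true, reason: null };
    } catch (error) {
        return { serializable: false, reason: error.message };
    }
}

// Illustrative inputs only: a circular object and a plain one.
const circular = { sheets: [] };
circular.self = circular;
const circularResult = inspectBeforeSerializing(circular, () => {});
const plainResult = inspectBeforeSerializing({ sheets: [] }, () => {});
```

Calling the helper just before the point where the server converts the workbook to JSON would show both the object's contents and the reason serialization fails, if it does.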

dondi commented 3 years ago

@dondi can also examine the two files at the byte level to see if there is anything anomalous in the original (since it has happened before). #excelhiddentreasure

Onariaginosa commented 3 years ago

I looked at the workbook object, and it contains errors that look like the following:

    {
      errorCode: 'INVALID_GENE_TYPE',
      possibleCause: "Gene 'undefined' in row 17, column A in the production_rates sheet is not a string.",
      suggestedFix: 'Please make your gene name a string starting with a letter.'
    },
    {
      errorCode: 'INVALID_GENE_TYPE',
      possibleCause: "Gene 'undefined' in row 17, column A in the production_rates sheet is not a string.",
      suggestedFix: 'Please make your gene name a string starting with a letter.'
    },
    {
      errorCode: 'ERRORS_OVERLOAD',
      possibleCause: 'This workbook has over 20 errors.',
      suggestedFix: 'Please check the format of your spreadsheet with the guidelines outlined on the Documentation page and try again. If you fix these errors and try to upload again, there may be further errors detected. As a general approach for fixing the errors, consider copying and pasting just your adjacency matrix into a fresh Excel Workbook and saving it.'
    },

The output also includes ERRORS_OVERLOAD errors. Interestingly, all of the errors start on row 17 of the production_rates sheet, although the last gene is on row 16. There did not appear to be any additional data anywhere in the sheet, so I copied cells A1:B16 into a new sheet and renamed the old and new sheets to production_rates_hidden_errors and production_rates, respectively. When I ran it again, it worked properly. It seems that hidden additional data somewhere in the production_rates_hidden_errors sheet is causing the server to crash.

The fixed version of the broken file, with the broken sheet renamed, is attached below: optimization-parameters-default-tester.xlsx
The full log of the workbook object is attached below: optimization-parameters-default-tester-workbook-object-broken.pdf
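The symptom described in this comment (errors beginning on the row after the last gene) could be checked for directly. The sketch below is only an assumption about how such a check might look: it treats a sheet as an array of rows, each an array of cell values with `undefined` for empty cells, and lists the 1-based row numbers whose first column is not a non-empty string. The function name, the header labels, and the sample data are all invented for illustration and are not taken from GRNsight.

```javascript
// Hypothetical check (not GRNsight code): find data rows whose gene-name cell
// in column A is missing or not a string. Index 0 is assumed to be the header.
function findRowsWithoutGeneName(rows) {
    const suspects = [];
    rows.forEach((row, index) => {
        if (index === 0) {
            return; // skip the header row
        }
        const gene = row[0];
        if (typeof gene !== "string" || gene.trim() === "") {
            suspects.push(index + 1); // 1-based row numbers, as Excel reports them
        }
    });
    return suspects;
}

// Made-up sample sheet with one "phantom" row after the last gene.
const sampleRows = [
    ["id", "production_rate"],
    ["GENE_A", 0.5],
    ["GENE_B", 1.2],
    [undefined, undefined]
];
const suspectRows = findRowsWithoutGeneName(sampleRows);
```

A check like this would flag the extra rows without explaining where the hidden data comes from, so it complements rather than replaces a closer look at the file itself.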

dondi commented 3 years ago

I was able to extract the files from these workbooks, and for “copies” of each other, they have a surprising number of differences. A comparison can be found here: https://github.com/dondi/GRNsight/commit/5a24f4af64a18729a68bddc4cb0f5647eb65984b

Because of the number of differences, I haven’t yet pinpointed which one might be the key issue (many differences are trivial, like which cell is highlighted). But maybe we can look at this together; multiple eyeballs might be able to spot something more quickly.

dondi commented 3 years ago

Since it will be hard to control the kinds of Excel eccentricities that may show up, the action item here will be to add some error-handling code for a graceful exit.
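One possible shape for that error handling is sketched below, assuming the workbook processing can be wrapped in a single function call. The wrapper `processSafely`, the `UNKNOWN_ERROR` code, the suggested-fix wording, and the choice of status 400 are all placeholders for illustration; only the `errorCode` / `possibleCause` / `suggestedFix` shape is borrowed from the error objects quoted earlier in this thread.

```javascript
// Hypothetical wrapper (not GRNsight code): runs a workbook-processing function
// and turns any thrown exception into an error result instead of letting it
// terminate the server process.
function processSafely(processWorkbook, input) {
    try {
        const workbook = processWorkbook(input);
        // Serialize inside the try block so a non-serializable result is caught too.
        return { status: 200, body: JSON.stringify(workbook) };
    } catch (error) {
        return {
            status: 400, // assumed status; a 500 may be more appropriate in practice
            body: JSON.stringify({
                errors: [{
                    errorCode: "UNKNOWN_ERROR", // placeholder code
                    possibleCause: error.message,
                    suggestedFix: "Please try re-creating the workbook in a fresh file."
                }]
            })
        };
    }
}

// Illustrative calls with stand-in processing functions.
const failing = processSafely(() => { throw new Error("unreadable sheet"); }, null);
const passing = processSafely((genes) => ({ genes }), ["GENE_A"]);
```

The point of the design is that both the parsing step and the JSON conversion sit inside the same `try` block, so either kind of failure produces a response instead of ending the process.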

dondi commented 3 years ago

Issue #341 can be combined with this.

dondi commented 3 years ago

This will be addressed if #341 is fixed, so closing this one.