cfe-lab / phylodating


More informative error messages #3

Closed jwkai closed 2 years ago

jwkai commented 2 years ago

When the input files produce an error, this tool prints a traceback from the R script, which can be confusing for users who are not experienced with software troubleshooting.

For example, an info.csv file with a mismatched ID (RNAA5 where the tree file has RNA5),

ID,Date,Query
RNA1,2011-03-08,0
RNA2,2011-03-08,0
RNA3,2012-11-05,0
RNA4,2012-11-05,0
RNAA5,2012-11-05,0
RNA6,2015-05-07,0
RNA7,2015-05-07,0
DNA1,2018-09-05,1
DNA2,2018-09-05,1
DNA3,2018-09-05,1
DNA4,2018-09-05,1

together with this tree file,

(DNA1:0.000857,(RNA1:0.001053,((RNA5:0.019962,DNA3:0.001):0.040048,(RNA2:0.007126,((DNA2:0.005841,RNA3:0.000712):0.004969,(RNA4:0.008952,(DNA4:0.019875,(RNA6:0.003845,RNA7:0.01993):0.006931):0.030149):0.007087):0.010048):0.00697):0.001021):0.009876);

will produce the following very long error message (this is only an excerpt):

       root_and_regress: 
Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

Attaching package: 'data.table'

The following objects are masked from 'package:dplyr':

    between, first, last

Error: Info file missing data
Execution halted

plot_divergence: Registered S3 method overwritten by 'treeio':
  method     from
  root.phylo ape 
ggtree v2.0.1  For help: https://yulab-smu.github.io/treedata-book/

If you use ggtree in published research, please cite the most appropriate paper(s):

- Guangchuang Yu, Tommy Tsan-Yuk Lam, Huachen Zhu, Yi Guan. Two methods for mapping and visualizing associated data on phylogeny using ggtree. Molecular Biology and Evolution 2018, 35(12):3041-3043. doi: 10.1093/molbev/msy194
- Guangchuang Yu, David Smith, Huachen Zhu, Yi Guan, Tommy Tsan-Yuk Lam. ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data. Methods in Ecology and Evolution 2017, 8(1):28-36, doi:10.1111/2041-210X.12628
...

It would be useful to print a more readable error message above the traceback, containing at least the relevant line:

Error: Info file missing data

and, if possible, something more helpful, such as

Error: Info.csv missing ID "RNA5"

When the tool fails, the traceback is printed as follows: https://github.com/cfe-lab/phylodating/blob/e396398e36e8d71fb74309c70a9f1fe62cbf8c72/templates/jobs/details.html#L15-L22

Getting more informative errors might involve editing the R scripts.
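
As a rough illustration of the "print the relevant part above the traceback" idea, here is a minimal Python sketch that pulls the lines starting with `Error` out of captured R output. The function name `extract_error_lines` and the `sample` text are hypothetical and only for illustration; this is not how the tool currently handles the output, and multi-line R errors (for example `Error in f(x) :` followed by the message on the next line) would need extra handling.

```python
import re


def extract_error_lines(output: str) -> list[str]:
    """Return the lines of captured R output that begin with 'Error'."""
    # Match 'Error' as a whole word at the start of a line, so both
    # 'Error: ...' and 'Error in f(x) : ...' are picked up.
    pattern = re.compile(r"^\s*Error\b")
    return [line.strip() for line in output.splitlines() if pattern.match(line)]


# Shortened copy of the excerpt quoted in this issue.
sample = """Attaching package: 'dplyr'

The following objects are masked from 'package:stats':

    filter, lag

Error: Info file missing data
Execution halted
"""

for line in extract_error_lines(sample):
    print(line)
```

The extracted lines could then be shown above the full traceback, which stays available for anyone who wants the details.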

brj1 commented 2 years ago

Good idea. I have edited the R scripts so that they give better error messages. If the info file is missing an ID, the error message displayed will now be:

Info file missing data (ID in tree file not found in info file: "RNA5")

I also suppressed the library loading messages so that the actual error messages will be more visible.

I have pushed these changes to GitHub, but I will have to ask the lab IT staff to migrate the changes onto the web server.

jwkai commented 2 years ago

Thanks @brj1 !

I am able to deploy this to the server as well. However, I'm currently setting up a test version of the server on our internal network, which the software team has decided is a good intermediate step between updating the GitHub repository and making the change "live". I should have that ready within the next few working days.

jwkai commented 2 years ago

Tested and deployed! Looks good.