DEIB-GECO / GMQL

GMQL - GenoMetric Query Language
http://www.bioinformatics.deib.polimi.it/geco/
Apache License 2.0

MAP crash and GC errors #99

Closed. Erlaad closed this issue 6 years ago

Erlaad commented 6 years ago

A fatal exception is thrown when executing MAP operations if the input dataset is too large.

It usually takes one of two forms: either the Java garbage collector throws an exception, or the MAP7 step fails too many times and causes a crash.

I attach two logs from the GMQL web interface that show the problem. To reproduce this issue, please run the queries in the attached “queries.zip” file in order (Q1 through Q4): Q4 should be the crashing point. The queries use only ENCODE data, so no additional setup should be needed. Note that the problem does not always appear; we think it depends on the state of the Java heap at the time of execution. Similar results occur on the CINECA implementation.

Arif and Pietro traced the issue to the MAP7 and/or GMAP4 implementation. They suspect it is due to how internal state variables are handled, in particular the use of VARs instead of VALs (you can discuss the details with them).

Attachments: queries.zip, error_20171218_2.txt, error_20171218.txt

Erlaad commented 6 years ago

Tested the queries again and they now run to successful completion. While I cannot say for certain that this problem will not reappear with bigger datasets, I'd say the issue is fixed and tested.