GrammaticalFramework / gf-core

Grammatical Framework core: compiler, shell & runtimes
https://www.grammaticalframework.org
Other
129 stars 35 forks source link

Improve performance with long extend-lists #147

Closed anka-213 closed 1 year ago

anka-213 commented 1 year ago

When a long list of explicitly included/excluded constants are imported, the performance is O(n * m), where n is the number of excluded or included constants and m is the total number of constants in the imported module.

This patch uses a Data.Set to speed up the check to O(log(n) * m).

For example, this file would take several minutes to load, now it takes less than a second:

abstract FromWordNet = WordNet [
a_couple_Card,
a_la_carte_Adv,
a_la_mode_Adv,
a_little_Card,
a_lotPl_Card,
a_lotSg_Card,
-- 20 000 lines of words
zone_in_on_V2,
zone_out_V,
zonk_out_1_V,
zonk_out_2_V,
zoom_in_V,
zoom_in_on_V2,
zoom_out_V,
zoophobia_N ] ;
inariksit commented 1 year ago

Thanks for the work, good catch!

Funny that this 1/5th of WordNet takes longer to load than just using the full WordNet! :-P

(Well, also the reason of using this subset is that not all terms in the full WordNet have a linearisation in all languages, so using this subset prevents linearisations like "the [cat_8_N] sat on the [mat_6_N]".)