Two more bottlenecks discovered:

- `IsAbsoluteUrl` used `urls.ParseURL` to determine whether a value had a protocol. The regex in that `ParseURL` function is quite expensive, especially when it runs 1000+ times in `createTermDefinition` with a schema.org context. I switched to the logic defined in jsonld.js: check whether the value is an absolute URL (via the Go `url` package) or a blank node (starts with `_:`). This passes all tests and cuts the expansion of 100 schema.org JSON-LD documents from 13 seconds to ~8 seconds.
- The term regex that I extracted in PR #42 was called so often that I looked into simplifying it. The regex simply checks whether the term ends in one of about half a dozen characters. I replaced it with a switch statement, which dropped the processing time for the same 100 documents further, from 8 seconds to ~5 seconds.
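
For reviewers who want the shape of the two changes without opening the diff, here is a minimal sketch. It is not the code in this PR: `isAbsoluteIRI` and `endsWithGenDelim` are hypothetical names, and the character set in the switch is an assumption (the URI gen-delims `:`, `/`, `?`, `#`, `[`, `]`, `@`), since the description only says "about half a dozen characters".

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// isAbsoluteIRI is a hypothetical stand-in for the rewritten IsAbsoluteUrl.
// It treats blank node identifiers ("_:" prefix) as absolute and otherwise
// asks net/url whether the value carries a scheme, with no regex involved.
func isAbsoluteIRI(value string) bool {
	if strings.HasPrefix(value, "_:") {
		return true
	}
	u, err := url.Parse(value)
	return err == nil && u.IsAbs()
}

// endsWithGenDelim is a hypothetical stand-in for the term check that
// replaced the regex. The characters listed are an assumption, not taken
// from the PR.
func endsWithGenDelim(term string) bool {
	if term == "" {
		return false
	}
	// All candidate characters are ASCII, so inspecting the last byte is enough.
	switch term[len(term)-1] {
	case ':', '/', '?', '#', '[', ']', '@':
		return true
	}
	return false
}

func main() {
	for _, v := range []string{"http://schema.org/name", "_:b0", "name"} {
		fmt.Printf("isAbsoluteIRI(%q) = %v\n", v, isAbsoluteIRI(v))
	}
	for _, t := range []string{"http://schema.org/", "name"} {
		fmt.Printf("endsWithGenDelim(%q) = %v\n", t, endsWithGenDelim(t))
	}
}
```

Both helpers avoid running a regular expression on every call, which is where the savings in a hot path like `createTermDefinition` would come from.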
I'm really just using `pprof` in my application to hunt down hotspots. There are a few more, but we're getting into diminishing-returns territory at this point.
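
In case it helps anyone reproduce the measurements, this is roughly how a CPU profile can be captured with the standard `runtime/pprof` package. It is a sketch, not my actual setup: `profileCPU` is a hypothetical helper, and the loop in `main` is a placeholder for the real workload (expanding the 100 schema.org documents).

```go
package main

import (
	"fmt"
	"os"
	"runtime/pprof"
)

// profileCPU is a hypothetical helper: it writes a CPU profile of work()
// to the file at path.
func profileCPU(path string, work func()) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()

	if err := pprof.StartCPUProfile(f); err != nil {
		return err
	}
	// Deferred calls run in reverse order, so the profile is stopped
	// (and flushed) before the file is closed.
	defer pprof.StopCPUProfile()

	work()
	return nil
}

func main() {
	err := profileCPU("cpu.prof", func() {
		// Placeholder workload; substitute the document expansion loop here.
		sum := 0
		for i := 0; i < 50_000_000; i++ {
			sum += i % 7
		}
		fmt.Println("workload result:", sum)
	})
	if err != nil {
		fmt.Fprintln(os.Stderr, "profiling failed:", err)
		os.Exit(1)
	}
}
```

The resulting file can be inspected with `go tool pprof cpu.prof`; the `top` command inside that tool lists the functions that consumed the most CPU time.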