nim-lang / Nim

Nim is a statically typed compiled systems programming language. It combines successful concepts from mature languages like Python, Ada and Modula. Its design focuses on efficiency, expressiveness, and elegance (in that order of priority).
https://nim-lang.org

hoisting + memfiles = weird #4881

Open scriptum opened 8 years ago

scriptum commented 8 years ago

I'm trying to understand how to use term-rewriting macros for performance optimization. One interesting use is hoisting regular expressions.

Simple example:

import memfiles, re
proc main()=
  var c = 0
  var f = memfiles.open("in.txt")
  for line in f.lines():
    if line.contains(re"agggtaaa|tttaccct"):
      c += 1
  echo(c)
  f.close()
main()

In this example a Regex object is created for every line.

Hoisting could improve performance (the idea is taken from the docs):

import memfiles, re

template optRe*{re(pattern, flags)}(pattern: string{lit}, flags: typed): Regex =
  var glRe {.global, gensym.} = re(pattern, flags)
  glRe

proc main()=
  var c = 0
  var f = memfiles.open("in.txt")
  for line in f.lines():
    if line.contains(re"agggtaaa|tttaccct"):
      c += 1
  echo(c)
  f.close()
main()

Now the magic. Take a look at the generated C code:

NIM_EXTERNC N_NOINLINE(void, hoisting_weirdInit000)(void) {
nimRegisterGlobalMarker(T1021622097_3);
nimRegisterGlobalMarker(T1021622097_5);
    asgnRefNoCycle((void**) (&glre_154025_1021622097), re_152079_2126175263(((NimStringDesc*) &T1021622097_4), 24));
    asgnRefNoCycle((void**) (&glre_154045_1021622097), re_152079_2126175263(((NimStringDesc*) &T1021622097_4), 24));
    main_154006_1021622097();
}

The global variable is emitted and initialized twice!

But replace memfiles.open with system.open:

NIM_EXTERNC N_NOINLINE(void, hoisting_weirdInit000)(void) {
nimRegisterGlobalMarker(T1021622097_3);
    asgnRefNoCycle((void**) (&glre_154024_1021622097), re_152079_2126175263(((NimStringDesc*) &T1021622097_4), 24));
    main_154006_1021622097();
}

This looks good.

I guess this is because the memfiles iterator is more complicated, and this is probably not a bug. Is there any workaround?

Araq commented 8 years ago

I don't know of a workaround, but the codegen should not emit duplicate assignments.

scriptum commented 8 years ago

The same thing happens with the lexim project. It seems that the memfiles iterator, because it has two yields, is not compatible with macros. More precisely: iterator + several yields + macros = double substitution. Iterators probably need more attention.
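
Until the interaction between multi-yield iterators and rewrite templates is sorted out, hoisting by hand avoids the template altogether. The following is only a minimal sketch, not a fix for the codegen behaviour: the proc name `countMatches` and the sample input are made up for illustration, and an in-memory array stands in for `memfiles.open("in.txt")` so the snippet is self-contained. The only library calls used are `re` and `contains` from the `re` module.

```nim
import re

# Hypothetical helper (not from the original report): counts the lines
# that match the pattern used in the examples above.
proc countMatches(lines: openArray[string]): int =
  # Compile the regex once, before the loop, instead of relying on a
  # term-rewriting template to hoist it out of the loop body.
  let pattern = re"agggtaaa|tttaccct"
  for line in lines:
    if line.contains(pattern):
      inc result

when isMainModule:
  # Sample data standing in for the lines of in.txt.
  echo countMatches(["xxagggtaaaxx", "nothing here", "tttaccct"])
```

Because the regex is created outside the loop, the same `let pattern = ...` line can be placed before `for line in f.lines():` in the original `main`, and no global is needed at all.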

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think it is still a valid issue, write a comment below; otherwise it will be closed. Thank you for your contributions.