Docbuilder is slow - Githubissues

shakna-israel / docbuilder

Build Python Technical Documentation from Literate Programming Programs

http://docbuilder.rtfd.org

MIT License


Docbuilder is slow #53

Closed shakna-israel closed 9 years ago

shakna-israel commented 9 years ago

Docbuilder is currently too slow.

This is mostly due to the way it iterates over the same line several times.

Some smarter loops should be able to speed it up considerably.

shakna-israel commented 9 years ago

readFile currently seems to be the function responsible for the crappy speed.

Each line in the file being read needs to be read into memory, then checked to see if it's Markdown or code, and then sent off to the relevant function to be written to the output file.

Speeding this up isn't going to be simple, but it is absolutely necessary given the terrible output speed it currently has.

The current average seems to be around 0.7 seconds per line. Multiply that out to a 5,000-line file, a totally reasonable file size, and you end up with 3,500 seconds, or 58.33 minutes. That is about an hour for a 5,000-line file.
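
For reference, the estimate can be reproduced with a few lines of Python. The constant names are only illustrative; the 0.7 seconds per line figure is the rough average quoted in this comment, not a fresh measurement.

```python
# Rough cost estimate for processing a whole file at the observed rate.
SECONDS_PER_LINE = 0.7   # approximate average reported in this comment
LINE_COUNT = 5000        # example file size used in this comment

total_seconds = SECONDS_PER_LINE * LINE_COUNT
total_minutes = total_seconds / 60

print(f"{total_seconds:.0f} s = {total_minutes:.2f} min")
```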

shakna-israel commented 9 years ago

Speed Options:

Parsing Markdown and code in blocks, like #28, might be able to reduce the amount of data passed between functions.
Loading the file into memory before trying to parse it might improve the speed, and reduce the chance of a race condition.
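
A minimal sketch of the block idea, not docbuilder's actual code: consecutive lines of the same kind are grouped so that each block is handed on once, instead of once per line. The names is_markdown and iter_blocks are hypothetical, and the rule that a leading # marks a Markdown line is an assumption made purely for illustration.

```python
from itertools import groupby


def is_markdown(line):
    # Assumption: comment lines (leading '#') carry Markdown, all else is code.
    return line.lstrip().startswith("#")


def iter_blocks(lines):
    """Yield (kind, block_lines) for each run of consecutive same-kind lines."""
    for markdown, run in groupby(lines, key=is_markdown):
        yield ("markdown" if markdown else "code"), list(run)


sample = [
    "# Title\n",
    "# Some prose.\n",
    "x = 1\n",
    "y = 2\n",
    "# More prose.\n",
]
blocks = list(iter_blocks(sample))

for kind, block in blocks:
    print(kind, len(block))
```

With this shape, five lines turn into three hand-offs to the writer functions rather than five, and the saving grows with longer runs of code or prose.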

shakna-israel commented 9 years ago

Delaying till 0.5

shakna-israel commented 9 years ago

The fileinput library is used to stream files and process them, so only one line is ever held in memory.

As only one line is ever being processed at once by readFile, this seems ideal, especially as fileinput is part of the stdlib.
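
A minimal sketch of streaming with fileinput, under stated assumptions: classify and stream_file are hypothetical names rather than docbuilder functions, and the leading # test for Markdown is only a stand-in for whatever check readFile really performs. The temporary file exists just so the example runs on its own.

```python
import fileinput
import os
import tempfile


def classify(line):
    # Assumption: comment lines are Markdown, everything else is code.
    return "markdown" if line.lstrip().startswith("#") else "code"


def stream_file(path):
    """Yield (kind, line) pairs while reading the file one line at a time."""
    with fileinput.input(files=(path,)) as stream:
        for line in stream:
            yield classify(line), line


# Write a tiny demo file so the sketch is self-contained.
with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as handle:
    handle.write("# Heading\nprint('hi')\n")
    demo_path = handle.name

kinds = [kind for kind, _ in stream_file(demo_path)]
os.remove(demo_path)
print(kinds)
```

Each line is classified exactly once as it is read, so there is no second pass over the same line.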

shakna-israel commented 9 years ago

Only the first line of a file should be checked for a hashbang (#!). See stringManage.
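
A minimal sketch of the first-line-only check. It does not reproduce stringManage, whose real signature is not shown here; split_shebang is a hypothetical helper, and what happens to the hashbang afterwards (dropped, or kept as code) is left to the caller.

```python
from itertools import chain


def split_shebang(lines):
    """Return (shebang_or_None, remaining_lines), testing only the first line."""
    iterator = iter(lines)
    first = next(iterator, None)
    if first is None:
        return None, iterator          # empty input
    if first.startswith("#!"):
        return first, iterator         # hashbang found, rest is untouched
    return None, chain([first], iterator)  # no hashbang, put the line back


source = ["#!/usr/bin/env python\n", "# Title\n", "#! not a hashbang here\n"]
shebang, rest = split_shebang(source)
remaining = list(rest)
print(repr(shebang), len(remaining))
```

Later lines that happen to start with #! are never tested, which removes one comparison per line from the main loop.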

shakna-israel commented 9 years ago

Solved as of ce7bf9a81f2bb2ff9cf26f570c06193738d0e87d