Open nuttyartist opened 1 year ago
When it comes to performance tests, there are certain things that play into the results, for example:
So currently it is difficult to know the exact reasons for your results.
Besides that, maddy's regex-based approach may currently be slowing down Markdown processing. In version 2 I plan to remove the usage of regex and go with another approach, which will hopefully speed maddy up (and which I will, of course, benchmark). Until then, maddy might not be the fastest solution.
I'm working every now and then on version 2, but cannot commit yet to a release date due to RL and maddy being a side-project.
Of course - if somebody finds a way to speed things up a little in the meantime - I'm always happy for contributions.
Excuse my late reply. Here's a reproducible test with the first chapter of Moby Dick in Markdown: https://gist.github.com/nuttyartist/cb0053ccda823ac98a7ce58f296269cc
I got somewhat consistent results, as follows:
During Debug mode:
Maddy took 84380 milliseconds
MD4C took 0 milliseconds
During Release mode:
Maddy took 17552 milliseconds
MD4C took 0 milliseconds
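A reading of "0 milliseconds" presumably means the run finished in under one millisecond and the value was truncated, so a finer-grained timer makes the comparison easier to interpret. The following is only a minimal sketch of such a harness. `medianMicroseconds` is a hypothetical helper, not part of maddy or MD4C, and the callable passed in is assumed to wrap whichever parser call is being measured.

```cpp
#include <algorithm>
#include <chrono>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of maddy or MD4C): runs `work` several times
// and returns the median wall-clock duration in microseconds.
template <typename Work>
std::int64_t medianMicroseconds(Work&& work, int runs = 5)
{
  if (runs < 1) {
    runs = 1;  // always take at least one sample
  }
  std::vector<std::int64_t> samples;
  samples.reserve(static_cast<std::size_t>(runs));
  for (int i = 0; i < runs; ++i) {
    // steady_clock is monotonic, so it is unaffected by system clock changes.
    const auto start = std::chrono::steady_clock::now();
    work();
    const auto stop = std::chrono::steady_clock::now();
    samples.push_back(
      std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count());
  }
  // The median is less sensitive to a single slow run than the mean.
  std::sort(samples.begin(), samples.end());
  return samples[samples.size() / 2];
}
```

The callable should keep its result alive (for example by storing the generated HTML in a variable declared outside the lambda) so that an optimized Release build cannot discard the work being timed.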
EDIT: I edited the title after realizing Qt is using MD4C underneath.
I ran into the performance issue too, and for me it makes maddy almost unusable. After some profiling and testing I found that the culprits are the following parsers:
EMPHASIZED_PARSER, ITALIC_PARSER, STRIKETHROUGH_PARSER, STRONG_PARSER
What they have in common is a long regex that seems to take a long time to evaluate. I don't know if this breaks anything, but I replaced them with the following loops:
EmphasizedParser
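void
Parse(std::string& line) override
{
  // Plain substring search instead of a regex: wrap each "_..._" pair in <em>.
  const std::string pattern = "_";
  const std::string newPattern = "em";
  const std::size_t patlen = pattern.size();
  for (;;) {
    // Opening delimiter; stop when none is left.
    const auto pos1 = line.find(pattern);
    if (pos1 == std::string::npos) {
      break;
    }
    // Closing delimiter; an unmatched opener is left untouched.
    const auto pos2 = line.find(pattern, pos1 + patlen);
    if (pos2 == std::string::npos) {
      break;
    }
    const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
    // replace() modifies line in place, so no self-assignment is needed.
    line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
  }
}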
ItalicParser
void
Parse(std::string& line) override
{
  // Same loop as EmphasizedParser, wrapping each "*...*" pair in <i>.
  const std::string pattern = "*";
  const std::string newPattern = "i";
  const std::size_t patlen = pattern.size();
  for (;;) {
    const auto pos1 = line.find(pattern);
    if (pos1 == std::string::npos) {
      break;
    }
    const auto pos2 = line.find(pattern, pos1 + patlen);
    if (pos2 == std::string::npos) {
      break;
    }
    const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
    line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
  }
}
StrikeThroughParser
void
Parse(std::string& line) override
{
  // Same loop as EmphasizedParser, wrapping each "~~...~~" pair in <s>.
  const std::string pattern = "~~";
  const std::string newPattern = "s";
  const std::size_t patlen = pattern.size();
  for (;;) {
    const auto pos1 = line.find(pattern);
    if (pos1 == std::string::npos) {
      break;
    }
    const auto pos2 = line.find(pattern, pos1 + patlen);
    if (pos2 == std::string::npos) {
      break;
    }
    const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
    line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
  }
}
StrongParser
void
Parse(std::string& line) override
{
  const std::string newPattern = "strong";
  // Same search-and-replace loop, run first for "**" and then for "__".
  for (const char* delimiter : {"**", "__"}) {
    const std::string pattern = delimiter;
    const std::size_t patlen = pattern.size();
    for (;;) {
      const auto pos1 = line.find(pattern);
      if (pos1 == std::string::npos) {
        break;
      }
      const auto pos2 = line.find(pattern, pos1 + patlen);
      if (pos2 == std::string::npos) {
        break;
      }
      const std::string word = line.substr(pos1 + patlen, pos2 - pos1 - patlen);
      line.replace(pos1, (patlen + pos2) - pos1, "<" + newPattern + ">" + word + "</" + newPattern + ">");
    }
  }
}
I didn't measure how much faster this is, but my application went from being very laggy when parsing markdown-files to no lag that I can notice at all.
This is just a quick fix and I don't have time at the moment to clean it up and test it more, otherwise I would make a pull request. Just sharing it hoping that it is useful.
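Since the four loops above differ only in the delimiter and the tag name, they could be factored into one helper. The following is a minimal sketch under that assumption. `replaceDelimited` is a hypothetical free function, not something maddy provides. It also scans the line once from left to right instead of restarting the search from the beginning after each replacement, and for the delimiters used above it should produce the same output as the loops.

```cpp
#include <string>

// Hypothetical helper (name and signature are not from maddy): wraps every
// matched pair of `delim` in <tag>...</tag> and returns the rewritten line.
std::string replaceDelimited(const std::string& line,
                             const std::string& delim,
                             const std::string& tag)
{
  if (delim.empty()) {
    return line;  // an empty delimiter would match everywhere
  }
  std::string out;
  out.reserve(line.size());
  std::size_t cursor = 0;  // start of the text not yet copied to `out`
  for (;;) {
    const std::size_t open = line.find(delim, cursor);
    if (open == std::string::npos) {
      break;
    }
    const std::size_t close = line.find(delim, open + delim.size());
    if (close == std::string::npos) {
      break;  // unmatched opener: leave the rest of the line untouched
    }
    out.append(line, cursor, open - cursor);  // text before the opener
    out += "<" + tag + ">";
    out.append(line, open + delim.size(), close - open - delim.size());
    out += "</" + tag + ">";
    cursor = close + delim.size();
  }
  out.append(line, cursor, std::string::npos);  // remaining tail
  return out;
}
```

Inside a parser, a call such as `line = replaceDelimited(line, "~~", "s");` would then replace the whole loop body. Like the loops above, this sketch ignores escaping and code spans, and the order in which the parsers run still matters because `*` also matches inside `**`.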
Hello! Thanks for this library. I was wondering why I got such a difference in performance for the same text:
Maddy took 5304 milliseconds
Qt took 5 milliseconds
Maddy code:
Qt code:
EDIT: By mistake I set it as a feature request.