Closed GoogleCodeExporter closed 9 years ago
Hi,
By repeated trying to open his epub and then exiting completely out of Sigil
and trying again, I was able to recreate the problem (mainly missing first
graphic or showing alt text of that graphic) and 3 times I was able to get it
to segfault.
I have zipped up all 3 Apple crash report logs and posted them here for you
just in case they help.
Original comment by kevin.b.hendricks@icloud.com
on 29 Oct 2010 at 6:36
Attachments:
Hi,
I tried opening the test epub again, and when I noticed that the first graphic
was missing, I asked Sigil to save the file as an epub and no crash occurred.
Interestingly, it hacked up the generated xhtml in a funny way, overwriting the
the link href of the css file with the name of the first graphic!
So there appears to be some corruption going on.
Here is a snippet of that malformed xhtml file:
<?xml version="1.0" encoding="utf-8" standalone="no"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<title>Arnico Grow chapter 01</title>
<link href="../Images/chapter-01_fmt.jpeg" rel="stylesheet" type="text/css" />
</head>
<body>
<div class="story" id="arnico-grow-chapter-01">
<p class="first-paragraph"><span class="generated-style">2006</span></p>
<p class="first-paragraph"><img alt="chapter-01.tif" src="../Images/chapter-01_fmt.jpeg" /><span class="generated-style">It was a bright ...
Hope this helps,
Kevin
Original comment by kevin.b.hendricks@icloud.com
on 29 Oct 2010 at 7:00
Hi,
It almost looks like something is changing the m_HTMLUpdates hash table while
this is running!
Original comment by kevin.b.hendricks@icloud.com
on 29 Oct 2010 at 7:45
Hi Kevin.
Thanks for helping me with this. I'm sure there are things you'd rather be
doing...
In regards to the missing image, are we looking at the same thing? I don't
have a missing image. Enclosed is a screen capture. Or perhaps i'm looking at
the wrong reference?
However, Sigil, for whatever reason, is not crashing anymore. I haven't
changed a thing. But since it's stabilized I've been able to explore the
whiting out of the display text. As stated in my last post I experimented with
the CSS/Styles and removed the specific font and replaced it with a generic
"serif" attribute. Which made the regular text appear in black, but oddly
enough caused the text that was black, the italics, (example: <span
class="italic">“Heredia,”</span>) to disappear (turn to white). I've
experimented with specifying black in the italic/span attributes but it does
not respond. So weird. This was not an issue with the earlier version of Sigil
I was using. And to have one issue disappear, then have new problems appear is
equally weird. I hope you have an idea because I'm out of them.
Again, thanks for your time. Your enthusiasm is appreciated, and your patience
with dealing with someone without coding chops is even moreso appreciated.
Erik
current css:
@font-face {
font-family: serif;
font-style: normal;
font-weight: normal;
}
div.generated-style {
}
div.generated-style-2 {
}
p.first-paragraph {
font-family: serif;
line-height: 1.27em;
font-size: 0.92em;
margin-bottom: 0.00em;
margin-top: 9.16em;
text-indent: 0.00em;
margin-right: 0.00em;
margin-left: 0.00em;
text-align: justify;
font-weight: normal;
font-style: normal;
color: rgb(0,0,0);
}
p.paragraph-text {
font-family: serif;
line-height: 1.27em;
font-size: 0.92em;
margin-bottom: 0.00em;
margin-top: 0.00em;
text-indent: 1.36em;
margin-right: 0.00em;
margin-left: 0.00em;
text-align: justify;
font-weight: normal;
font-style: normal;
color: rgb(0,0,0);
}
span.generated-style {
}
span.italic {
font-weight: normal;
font-style: italic;
}
Original comment by gordieor...@gmail.com
on 29 Oct 2010 at 8:30
Attachments:
Hi Valloric,
FWIW,
The following change (synchronizing things) makes all of the problems and
corruption disappear.
void PerformHTMLUpdates::UpdateHTMLReferences()
{
QList< xc::DOMElement* > nodes = XhtmlDoc::GetTagMatchingDescendants( *m_Document->getDocumentElement(), PATH_TAGS );
int node_count = nodes.count();
QFutureSynchronizer< void > sync;
for ( int i = 0; i < node_count; ++i )
{
sync.addFuture( QtConcurrent::run( this, &PerformHTMLUpdates::UpdateReferenceInNode, nodes.at( i ) ) );
sync.waitForFinished();
}
// We wait until all the nodes are updated
// sync.waitForFinished();
}
So something in UpdateReferenceinNode must not be safe to use by multiple
threads.
I looks and all of the nodes passed in are unique addresses in the DOM tree
just as you said they would be. I can't see where/how m_HTMLUpdates can be
changed in that routine unless memory corruption occurs. Assuming no attribute
of a node is duplicated at another node (ie. that all attribute address are
unique if the node they come from is unique) perhaps the NodeMap or node
attributes themselves have some common storage mechanism in the DOM
implementation that keeps attributes from different nodes in a single structure
that gets reallocated when something is changed in one of its elements, that
would make it unsafe.
No idea here really. But my hack seems to make things stable for me and I am
including it in my own build until an official solution from you comes
available.
Sorry I could not be more help.
Take care,
KevinH
Original comment by kevin.b.hendricks@icloud.com
on 30 Oct 2010 at 8:48
Hi Valloric,
FWIW, I found the following at the Xerces Apache.org website FAQ:
---cut-here---
FAQ: IS XERCES-C++ THREAD-SAFE
This is not a question that has a simple yes/no answer. Here are the rules for
using Xerces-C++ in a multi-threaded environment:
Within an address space, an instance of the parser may be used without
restriction from a single thread, or an instance of the parser can be accessed
from multiple threads, provided the application guarantees that only one thread
has entered a method of the parser at any one time.
When two or more parser instances exist in a process, the instances can be used
concurrently, without external synchronization. That is, in an application
containing two parsers and two threads, one parser can be running within the
first thread concurrently with the second parser running within the second
thread.
The same rules apply to Xerces-C++ DOM documents. Multiple document instances
may be concurrently accessed from different threads, but any given document
instance can only be accessed by one thread at a time.
DOMStrings allow multiple concurrent readers. All DOMString const methods are
thread safe, and can be concurrently entered by multiple threads. Non-const
DOMString methods, such as appendData(), are not thread safe and the
application must guarantee that no other methods (including const methods) are
executed concurrently with them.
The application also needs to guarantee that only one thread has entered either
the method XMLPlatformUtils::Initialize() or the method
XMLPlatformUtils::Terminate() at any one time.
---cut-here---
So if each xhtml file in an epub was a separate DOM document, then you could
parallelize them, but it seems from this faq answer, that you can not
concurrently run updates on the same DOM document from multiple threads.
I saw a post on another site that said, this is because xerces-c uses caches
and shared structures (maps) that cross nodes, but I have not looked closely at
that code to see if that was true or not.
Take care,
Kevin
Original comment by kevin.b.hendricks@icloud.com
on 31 Oct 2010 at 7:00
Hi Kevin,
Yes, this problem is caused by Xerces storing things in a non-thread-safe way
internally. I realised this was the problem after seeing the thread on
MobileRead that is related to this issue (
http://www.mobileread.com/forums/showthread.php?t=100530 ) and then stepping
through Xerces source.
QDom was internally thread-safe, so this worked there. But I'll have to
serialise write access to the Xerces-provided DOM. It won't really slow down
anything since from previous profiling sessions I know that parallelizing
UpdateReferenceInNode updates only provides about 5% perf increase.
Original comment by Strahinja.Markovic@gmail.com
on 31 Oct 2010 at 7:13
This issue was closed by revision 61d0b7612b.
Original comment by Strahinja.Markovic@gmail.com
on 31 Oct 2010 at 8:14
Kevin, even though I'm pretty sure this is now fixed, I'd appreciate it if you
could confirm this on your machine. I'm was unable to reproduce the original
problem on any machine, but you seemed to have had better (worse?) luck.
Original comment by Strahinja.Markovic@gmail.com
on 31 Oct 2010 at 8:17
Hi,
Grabbed the diff, applied it to the official 0.3.0 RC2 source release and I am
recompiling now.
I will run checks on it against the sample document and verify it works and let
you know.
Thanks!
Kevin
Original comment by kevin.b.hendricks@icloud.com
on 31 Oct 2010 at 8:35
Hi,
Just ran it and was able to successfully open the test file 20 times in a row
with no problems. Previously, I ran into problems after just 2 or 3 openings
of some sort.
Looks like you nailed it.
Thanks,
Kevin
Original comment by kevin.b.hendricks@icloud.com
on 31 Oct 2010 at 8:38
Thanks Kevin. :)
Original comment by Strahinja.Markovic@gmail.com
on 31 Oct 2010 at 8:41
Hi Kevin.
This all sounds good. And good job! It's interesting to watch you guys tackle
a problem. So do i wait for a new updated build? What do you recommend?
I haven't solved the disappearing (whited out) italic text but I guess I can
work with it as long as the program isn't crashing.
Original comment by gordieor...@gmail.com
on 2 Nov 2010 at 3:53
@erik A new release of Sigil with this fix is forthcoming.
Original comment by Strahinja.Markovic@gmail.com
on 2 Nov 2010 at 10:34
Original issue reported on code.google.com by
gordieor...@gmail.com
on 29 Oct 2010 at 5:49Attachments: