Filename removed from href in anchor tag when importing HTML

GoogleCodeExporter commented 9 years ago

On import, Sigil removes the '../Text/Content.html' part from the href of the 
<a> tag in the attached file.

What steps will reproduce the problem?
1. Open Existing file and select attached html

What is the expected output? What do you see instead?

<a href="../Text/Content.html#TOC_Chapter7">Chapter 7</a> is expected.
<a href="#TOC_Chapter7">Chapter 7</a> is what I get.

What version of the product are you using? On what operating system?

Sigil Version 4.903

OS

Using windows XP sp3

Original issue reported on code.google.com by shaun.voysey@outlook.com on 2 Jan 2012 at 7:05

Attachments:

toc.html

GoogleCodeExporter commented 9 years ago

I need to see the full epub to figure out what is happening. You can add the 
Private tag if you don't want the file publicly accessible. Only the reporter, 
myself, one other contributor would have access to the file with the Private 
tag set.

Original comment by john@nachtimwald.com on 2 Jan 2012 at 7:26

GoogleCodeExporter commented 9 years ago

Issue 1157 has been merged into this issue.

Original comment by john@nachtimwald.com on 13 Jan 2012 at 1:34

GoogleCodeExporter commented 9 years ago

Example file in issue #1157

Original comment by john@nachtimwald.com on 13 Jan 2012 at 1:35

GoogleCodeExporter commented 9 years ago

Issue 1175 has been merged into this issue.

Original comment by john@nachtimwald.com on 13 Jan 2012 at 1:35

GoogleCodeExporter commented 9 years ago

Issue 1322 has been merged into this issue.

Original comment by meme90...@gmail.com on 21 Mar 2012 at 5:17

GoogleCodeExporter commented 9 years ago

Original comment by meme90...@gmail.com on 21 Mar 2012 at 5:17

GoogleCodeExporter commented 9 years ago

Patch attached.

It may be that the filename was stripped for a reason...

Original comment by meme90...@gmail.com on 22 Mar 2012 at 6:08

Changed title: Removal of filename from href in anchor tag when importing HTML
Changed state: Started
Added labels: Milestone-0.6.0

Attachments:

Sigil-patch-issue1175-filename-stripped-href.txt

GoogleCodeExporter commented 9 years ago

Nice patch :) I would expect at least some '+' ;) 
(just joking, in good mood right now)

Does anybody know why this feature was implemented in the first place? If a 
feature exists, it means somebody somewhere thought it was a good idea to have it.

Maybe the solution should be:
- check if filename is referencing self
- if so - remove filename from URL, leave only part after '#'
- else - leave URL intact
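
As an illustration only, the three steps above could be sketched roughly as 
follows. This is not Sigil's code or the attached patch; the function name 
`normalize_href` and its parameters are hypothetical, and the sketch assumes 
that comparing base filenames is enough to detect a self-reference.

```python
import posixpath
from urllib.parse import urlsplit


def normalize_href(href: str, current_file: str) -> str:
    """Strip the filename from href only when it points at current_file itself.

    Hypothetical helper for illustration; not part of Sigil.
    """
    parts = urlsplit(href)
    # Leave external links, links with a query, and links without a
    # fragment untouched.
    if parts.scheme or parts.netloc or parts.query or not parts.fragment:
        return href
    # A bare "#fragment" link is already in its shortest form.
    if not parts.path:
        return href
    # Assumption: matching base filenames means the link targets this file.
    # A real implementation would probably resolve and compare full paths.
    if posixpath.basename(parts.path) == posixpath.basename(current_file):
        return "#" + parts.fragment
    # Link to a different file: keep the URL intact.
    return href
```

With this sketch, a link in toc.html to '../Text/Content.html#TOC_Chapter7' is 
left as it is, while 'toc.html#top' inside toc.html itself becomes '#top'.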

Original comment by standa31...@gmail.com on 23 Mar 2012 at 7:37

GoogleCodeExporter commented 9 years ago

The filename was stripped when you use Open to open an html file (something 
that will probably be removed); the stripping wasn't meant for Import html. 
Removing the filename from a self-referenced link is more of a cleanup routine 
and is in fact the opposite of the issue reported. If it's an actual problem 
for someone, then a new issue can be raised.

Original comment by meme90...@gmail.com on 23 Mar 2012 at 7:47

GoogleCodeExporter commented 9 years ago

Trouble is, you need to import before you can open... Hence the report.

There are quite a few of us who build our html externally, as separate files 
(for example Content and Table of Contents). So the links between those 
external files need the file names in order to work.

Admittedly, most of the older e-readers do not have touch screens, so the 
links were moot. But a lot of people read on desktop machines and tablets, 
and these links can be vital.

Original comment by shaun.voysey@outlook.com on 23 Mar 2012 at 11:51

GoogleCodeExporter commented 9 years ago

Open refers to using File->Open and selecting an html file (not an epub file) 
to open. This apparently isn't very common and will probably be removed as 
part of fixing this.

Original comment by meme90...@gmail.com on 23 Mar 2012 at 5:18

GoogleCodeExporter commented 9 years ago

I reported id=1322 that was merged into this issue.

What I was trying to achieve was to create a mobi file out of the documentation 
for a GNU library. What I did was:
- create a New book
- right-click "Text" in the Book Browser and select "Add Existing files..."
- select the HTML files from the documentation
- the files were added to "my book", but the links became broken

IMO this was the correct procedure to create a new book and populate it with 
existing files.

Original comment by standa31...@gmail.com on 25 Mar 2012 at 7:43

GoogleCodeExporter commented 9 years ago

I'm unable to run any tests myself. When building Sigil from source with:
cmake -G "Unix Makefiles" -DCMAKE_INSTALL_PREFIX=/opt/sigil_work 
-DCMAKE_BUILD_TYPE=Release .
make
sudo make install

I got:
I/O error : No such file or directory
/usr/share/mime/text/html.xml:1: parser error : Extra content at the end of the 
document

Original comment by standa31...@gmail.com on 31 Mar 2012 at 5:48

GoogleCodeExporter commented 9 years ago

The patch was uploaded, but not integrated into the source, so even if you 
built the source you wouldn't see the fix. The patch needs review, as it might 
have an adverse impact on opening html files.

The inability to build is caused by zlib1.6 in the source tree on Linux, a 
known issue. You can replace it with the 1.5 files from an older version and it 
should build OK.

Original comment by meme90...@gmail.com on 4 Apr 2012 at 11:15

GoogleCodeExporter commented 9 years ago

Need to review later.

Original comment by meme90...@gmail.com on 25 Apr 2012 at 7:41

Removed labels: Milestone-0.6.0

GoogleCodeExporter commented 9 years ago

Original comment by daveheil...@gmail.com on 19 May 2012 at 3:20

GoogleCodeExporter commented 9 years ago

Submitted fix.  

The original stripping of filenames appears to have been an attempt to clean up 
links that might have been hard-coded to a full filesystem path in the html, 
etc. But even if the links don't work now in some cases, a find-and-replace can 
fix them more easily since the filename remains. Note that the original file on 
this issue has other problems not related to this.
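
For illustration, such a find-and-replace could also be scripted outside Sigil. 
The sketch below is an assumption-laden example, not something from the fix: 
the helper name `repoint_links`, the sample 'file:///C:/book/' path, and the 
default '../Text/' prefix are all hypothetical, and it only handles 
double-quoted href attributes.

```python
import re


def repoint_links(html: str, filename: str, new_prefix: str = "../Text/") -> str:
    """Rewrite hrefs that end in `filename` so they use `new_prefix` instead.

    Hypothetical helper for illustration; any #fragment is preserved.
    """
    pattern = re.compile(
        r'href="(?:[^"#]*[/\\])?'   # optional old directory part, up to a slash
        + re.escape(filename)       # the filename that survived the import
        + r'(#[^"]*)?"'             # optional fragment, kept as group 1
    )
    return pattern.sub(
        lambda m: f'href="{new_prefix}{filename}{m.group(1) or ""}"', html
    )


# Example: a link hard-coded to a full filesystem path.
broken = '<a href="file:///C:/book/Content.html#TOC_Chapter7">Chapter 7</a>'
fixed = repoint_links(broken, "Content.html")
```

Because the filename is still present in the href, the pattern has something 
to match on; with only '#TOC_Chapter7' left there would be no way to tell 
which file the link was meant for.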

Original comment by daveheil...@gmail.com on 1 Sep 2012 at 6:58

Changed title: Filename removed from href in anchor tag when importing HTML
Changed state: Fixed
Added labels: Milestone-0.6.0

Rainie3535 / sigil

Filename removed from href in anchor tag when importing HTML #1155