BCLibCoop / nnels-a11y-publishing

GNU Lesser General Public License v3.0

Language tag needs to be moved from the <body> tag up to the root HTML element. #17

Open zwettemaan opened 5 years ago

zwettemaan commented 5 years ago

@LauraB7, when I look at one of the sample files, I see:

<!DOCTYPE html>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<html xmlns="http://www.w3.org/1999/xhtml" 
  xmlns:epub="http://www.idpf.org/2007/ops">
...
    <body id="Something-1" lang="en-CA" xml:lang="en-CA">
...
    </body>
</html>

I would change this to

<!DOCTYPE html>
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<html xmlns="http://www.w3.org/1999/xhtml" 
  xmlns:epub="http://www.idpf.org/2007/ops" 
  lang="en-CA" xml:lang="en-CA">
...
    <body id="Something-1">
...
    </body>
</html>

Is that correct?

If not, can you show me how to adjust this example?

zwettemaan commented 5 years ago

I've released a new version of DropToScript which now accepts complete EPUB files for drag/drop, in addition to individual html files.

I've added a 'LangTag' script which finds the lang and xml:lang on the body tag and moves them to the html tag
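For readers who want to see the transformation spelled out, here is a minimal sketch of the idea in Python. It is not the LangTag script itself (that one is PHP), and the function name `move_lang_to_html` is made up for this illustration. It assumes one `<html>` and one `<body>` start tag per file and uses plain string matching rather than an XML parser, since some of the sample files would not parse as strict XML.

```python
import re

# Matches lang="..." or xml:lang="..." (single or double quotes),
# including the whitespace in front of the attribute.
LANG_ATTR = re.compile(r'\s+(xml:lang|lang)\s*=\s*("[^"]*"|\'[^\']*\')')


def move_lang_to_html(markup: str) -> str:
    """Move lang/xml:lang from the <body> start tag to the <html> start tag."""
    body = re.search(r'<body\b[^>]*>', markup)
    html = re.search(r'<html\b[^>]*>', markup)
    if body is None or html is None:
        return markup  # nothing to do without both tags

    found = dict(LANG_ATTR.findall(body.group(0)))
    if not found:
        return markup  # no language attributes on <body>

    new_body = LANG_ATTR.sub('', body.group(0))
    # Only add attributes that <html> does not already carry.
    existing = {name for name, _ in LANG_ATTR.findall(html.group(0))}
    extra = ''.join(
        f' {name}={value}' for name, value in found.items() if name not in existing
    )
    new_html = html.group(0)[:-1] + extra + '>'

    # Replace <body> first: it comes after <html>, so the <html> offsets stay valid.
    markup = markup[:body.start()] + new_body + markup[body.end():]
    return markup[:html.start()] + new_html + markup[html.end():]
```

Running it a second time on its own output changes nothing, because by then `<body>` no longer carries any language attributes.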

zwettemaan commented 5 years ago

@LauraB7 Give it a try if you can. Also, @flittle8 , I've not yet tackled #16 (not working on El Capitan) - that's next on my list.

zwettemaan commented 5 years ago

As usual, latest version is here:

https://github.com/BCLibCoop/nnels-a11y-publishing/blob/master/ReleaseVersions

flittle8 commented 5 years ago

@zwettemaan Can we also add language tags to the HTML element if they don't exist? Is that a new GitHub issue, or should I add more details here?

zwettemaan commented 5 years ago

@flittle8 I pushed out a new release 1.0.6_1.0.10 which has a default setting for the lang/xml:lang attribute if none is found on the body tag. Tested with Sigil and your sample EPUBs and seems to work fine...

flittle8 commented 5 years ago

@zwettemaan Hi Kris, I tested this script on 3 files, and on all 3 it didn't work. I've attached the error messages that were generated for each. Let me know if you need further info.

Attachments: LangTag-2.txt, LangTag-3.txt, LangTag.txt

flittle8 commented 5 years ago

@zwettemaan also noticed that the readme file for this script refers to AutoTitle...

"## What it does

AutoTitle.php is a command-line PHP script which can process an HTML file..."

flittle8 commented 5 years ago

@zwettemaan Did you incorporate this into the script? https://github.com/BCLibCoop/nnels-a11y-publishing/issues/17#issuecomment-491240071 If no language tags exist, then we should grab the language code from the OPF and put it into the HTML files.
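To make the OPF idea concrete, here is a rough sketch of the lookup in Python. It is only an illustration of the request, not something LangTag does. The function name `opf_language` is hypothetical, and the sketch assumes the package document declares its language in a Dublin Core `dc:language` element, as EPUB package documents normally do.

```python
import xml.etree.ElementTree as ET

# Dublin Core namespace used for <dc:language> in EPUB package documents.
DC_NS = "http://purl.org/dc/elements/1.1/"


def opf_language(opf_text):
    """Return the first dc:language value found in an OPF document, or None."""
    root = ET.fromstring(opf_text)
    element = root.find(f".//{{{DC_NS}}}language")
    if element is None or not (element.text or "").strip():
        return None  # no usable language declared in the OPF
    return element.text.strip()
```

The returned value could then be written into each HTML file as `lang` and `xml:lang` on the `<html>` element.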

LauraB7 commented 5 years ago

I ran this script on the EPUB I just referred to in Issue #1, and it worked beautifully.

zwettemaan commented 5 years ago

@flittle8

Can you provide me with copies of those EPUBs? The error logs don't tell me much. I need to see the actual documents in order to diagnose and fix.

I've not implemented the OPF tag reading - not sure if there will be enough time left to add that. At present, you need to edit the config.txt file to 'inject' a desired language in case none is available.

I'll fix up the LangTag docs - copy paste error.

But if at all possible, please separate issues from one another.

Do not put unrelated problems into the discussion thread for an unrelated issue.

I have many balls in the air, and if problems I need to fix are not set up as separate issue entries, they will get lost.

I created an issue entry for the readme problem you found:

https://github.com/BCLibCoop/nnels-a11y-publishing/issues/20

flittle8 commented 5 years ago

@zwettemaan I reviewed the 3 EPUB files and I think they didn't work because the script currently can't handle this functionality of moving the language specified in the OPF to the elements of each page. I haven't found an EPUB with just language tags on the body yet, which I think is the only scenario that this script handles?

flittle8 commented 5 years ago

Laura says it's moving lang tags from body to html, so we can probably close this one off, unless you have time to add in that extra functionality to move OPF language to each HTML Page (replacing whatever is currently there).

zwettemaan commented 5 years ago

Hi Farrah, did you read the docs on the "defaultIfMissing" entry? That allows you to use the DropScript even if there is no lang on the body tag.

See default config:

https://github.com/BCLibCoop/nnels-a11y-publishing/blob/master/DropScripts/LangTag/LangTag.config.txt

and also search for 'defaultIfMissing' in

https://github.com/BCLibCoop/nnels-a11y-publishing/blob/master/DropScripts/LangTag/LangTag.ReadMe.md
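As a rough illustration of what a fallback like 'defaultIfMissing' amounts to, here is a short Python sketch. It is not the actual LangTag code. The function name `apply_default_lang` is invented, and it assumes the config has already been loaded into a dictionary. The default is added only when neither the `<html>` nor the `<body>` start tag declares a language.

```python
import re

# Detects a lang= or xml:lang= attribute inside a start tag.
HAS_LANG = re.compile(r'\s(?:xml:)?lang\s*=')


def apply_default_lang(markup: str, config: dict) -> str:
    """Add the configured default language to <html> when none is declared."""
    default = config.get("defaultIfMissing")
    html = re.search(r'<html\b[^>]*>', markup)
    body = re.search(r'<body\b[^>]*>', markup)
    if not default or html is None:
        return markup  # no default configured, or no <html> tag to extend

    tags = html.group(0) + (body.group(0) if body else '')
    if HAS_LANG.search(tags):
        return markup  # a language is already declared; leave it alone

    new_html = f'{html.group(0)[:-1]} lang="{default}" xml:lang="{default}">'
    return markup[:html.start()] + new_html + markup[html.end():]
```

With `{"defaultIfMissing": "en"}`, a file with no language attributes at all gets `lang="en" xml:lang="en"` on its `<html>` tag, and a file that already declares a language is returned unchanged.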

LauraB7 commented 5 years ago

The language tag script worked well for me on InDesign exported content. The only place it messed up was HTML files without a header at the top. In many of my EPUBs, for example, the title page is a standalone file that just has the image of the title page type (with alt text, of course). As there is no <h1>, the script doesn't work on that single file.

flittle8 commented 5 years ago

@zwettemaan yes, I have my settings like this:

{
    "defaultIfMissing": "en",

    }

I'll email you the EPUB

zwettemaan commented 5 years ago

Hi @flittle8. What do you know. It seems to work fine for me. I'll fire up the old El Capitan with the old Sigil to see if that's the problem. On my machine with the latest Sigil, Sigil also shows the added language attributes.

[Two screenshots, taken 2019-05-24 at 8:37:47 AM and 8:38:18 AM]

zwettemaan commented 5 years ago

"The language tag script worked well for me on InDesign exported content. The only place it messed up was HTML files without a header at the top. In many of my EPUBs, for example, the title page is a standalone file that just has the image of the title page type (with alt text, of course). As there is no <h1>, the script doesn't work on that single file."

@LauraB7 I could enhance the Cleaner script (which currently only touches <!DOCTYPE> and <?xml...?>) to also check and insert an HTML header if it is missing. That way, you'd run Cleaner first (so all files have an html tag), then LangTag.

zwettemaan commented 5 years ago

Ok, it fails on El Capitan. Probably another issue with the outdated version of PHP...

zwettemaan commented 5 years ago

Hi @flittle8. Version 1.0.6_1.0.12 is up. The failure was indeed caused by the older version of PHP on El Capitan not being able to execute LangTag.php. Now it seems to work fine on El Capitan too.

LauraB7 commented 5 years ago

Well, @zwettemaan, in the case of that HTML file that was missed, I wouldn't actually want a header inserted in the file. There are occasions where it makes good sense to skip a top-level header. So I am not certain that's a good solution.

zwettemaan commented 5 years ago

It would be up to you whether to run it or not. I.e. I am not proposing an automatic solution. All I am saying is: if you wanted the headers inserted, I could make Cleaner (or a variant thereof) do it. But if you don't run it, it does not do it. It's up to the user to decide whether to use that Cleaner or not.

LauraB7 commented 5 years ago

That makes good sense.

flittle8 commented 5 years ago

@zwettemaan Tested this out again on 3 files. It worked nicely on 2. However, on 1 EPUB it seemed to mess with the header again, so that the content doesn't render... Would you like me to email you that EPUB?

zwettemaan commented 5 years ago

Yes, please, I'll have a look.

flittle8 commented 5 years ago

Emailed it over

zwettemaan commented 5 years ago

@flittle8 Ok, I've put out an update which should work on all.