DeepBlueCLtd / LegacyMan

Legacy content for Field Service Manual
https://deepbluecltd.github.io/LegacyMan/index.html
Apache License 2.0

Retains dimensions of flags from the original HTML in the output DITA #646

robintw closed 6 months ago

robintw commented 6 months ago

Fixes #639.

IanMayo commented 6 months ago

Thanks Robin, I was just "kicking the tyres" a little, and moved the France1.1 flag to go immediately before the h2 title, instead of after it.

The code fell over at this line, though it looks like country_flag isn't used anywhere, and can be deleted: https://github.com/DeepBlueCLtd/LegacyMan/pull/646/files#diff-4b3d0c7b7f7b179f5eae8ad9d87891455caf6f2e18e73b59cc9eb93a3bc4886aL246

  File "/Users/ian/git/LegacyMan/parser/lman_parser.py", line 215, in process_sub_region
    if does_image_links_table_exist(self.root_path / path):
  File "/Users/ian/git/LegacyMan/parser/parser_utils.py", line 246, in does_image_links_table_exist
    country_flag = title.find_next("img")["src"]
TypeError: 'NoneType' object is not subscriptable

I removed that line, and then the parse fell over at:

  File "/Users/ian/git/LegacyMan/parser/lman_parser.py", line 219, in process_sub_region
    self.process_category_pages(
  File "/Users/ian/git/LegacyMan/parser/lman_parser.py", line 425, in process_category_pages
    flag_img["src"], flag_img.get("width"), flag_img.get("height")
TypeError: 'NoneType' object is not subscriptable

I can't verify whether this scenario occurs in the real content until Monday afternoon, but it wouldn't surprise me at all if the flag doesn't always come immediately after the h2. Where it doesn't, I can probably fix the pages manually, but it would be easier for me if the parser logged a "missing flag" error to the console rather than falling over.

robintw commented 6 months ago

Hmm, that's interesting - that's a different use of the country_flag that I wasn't aware of, in a different function.

I suspect if it hadn't crashed in that function it would have crashed in one of the places that country_flag is used (and one of the places I recently updated it). I'll add some code to print an error if it can't find the flag rather than crashing.

IanMayo commented 6 months ago

that sounds great, thanks.

robintw commented 6 months ago

Fixed now, it gives a warning of:

WARNING:  Cannot find flag image in page ../France1/France1.1.html

And replaces the URL to the non-existent image with a filename of FlagImageNotFound.jpg, so we get a sensible error in the publish step where it can't find the image referred to.
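
For illustration only, here is a minimal sketch of that behaviour; it is not the actual LegacyMan code. The helper name `get_flag_details` and the constant `PLACEHOLDER_FLAG` are hypothetical, and `title` is assumed to behave like a BeautifulSoup tag, so that `find_next("img")` returns either `None` or an element that supports `.get()`.

```python
# Hypothetical helper; names are illustrative, not taken from the LegacyMan parser.
PLACEHOLDER_FLAG = "FlagImageNotFound.jpg"


def get_flag_details(title, page_path):
    """Return (src, width, height) for the first <img> found after `title`.

    `title` is assumed to behave like a BeautifulSoup Tag: find_next("img")
    returns either None or an element supporting .get(attribute).
    """
    flag_img = title.find_next("img") if title is not None else None
    if flag_img is None or flag_img.get("src") is None:
        # Warn and carry on, rather than raising TypeError on None["src"].
        print(f"WARNING:  Cannot find flag image in page {page_path}")
        return PLACEHOLDER_FLAG, None, None
    # width and height may be absent from the original HTML; .get() then gives None.
    return flag_img.get("src"), flag_img.get("width"), flag_img.get("height")
```

Returning the placeholder filename rather than `None` keeps the parse running and defers the failure to the publish step, where the missing image is reported against a recognisable name.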

IanMayo commented 6 months ago

We're correctly passing the height and width parameters.

But when I check spain.dita, it does not contain the flag data at all. I put in trace lines, and saw that the flag data was collated, the flag-section was created, and it was present in the output written to file. It appears that Spain.html is being processed twice: once as a category page (which picks up the flag data), and once as a generic file (which ignores the flag data).

I started to capture this as an unrelated issue, but am starting to wonder if it's related to the collection of flag data. Since the flag images link back to the country, could that mean that the country page gets added to the "shopping list", which makes it get processed as a generic file, overwriting the output from when it was processed as a category page?
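
If that hypothesis holds, one possible guard is to drop any page already handled as a category page before the generic pass runs. The sketch below only illustrates the idea: `plan_generic_pages` and both of its parameters are hypothetical names, and how the parser actually stores its "shopping list" is an assumption.

```python
def plan_generic_pages(shopping_list, category_pages):
    """Return the linked pages that still need generic processing, in order.

    Pages already written out as category pages are skipped, so the generic
    pass cannot overwrite output that contains the flag data.
    """
    already_done = set(category_pages)
    remaining = []
    for page in shopping_list:
        if page in already_done:
            continue
        already_done.add(page)  # also drops duplicate entries in the list
        remaining.append(page)
    return remaining
```

For example, with `Spain.html` recorded as a category page, a shopping list of `["Spain.html", "a.html", "a.html"]` would reduce to `["a.html"]`.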