DeepBlueCLtd / LegacyMan

Legacy content for Field Service Manual
https://deepbluecltd.github.io/LegacyMan/index.html
Apache License 2.0

Country page being over-written with generic version #652

Closed IanMayo closed 6 months ago

IanMayo commented 6 months ago

I thought I'd spotted a pattern in our mock data yesterday, but today I've seen the pattern in the real data.

A stakeholder has reported that the flag is missing from a country. I've checked and re-checked the structure of the flag data, and it matches other flag data.

I've also inserted a debug line, just before the flag is added to the dita, and it's present.

But, after the publish is complete, there is no flag.

I'm pretty sure the version of the file produced by the country/category specific parsing is being overwritten by a generic parser.

Here is my comment regarding the same pattern of behaviour in the Spain file: https://github.com/DeepBlueCLtd/LegacyMan/pull/646#issuecomment-1946436381

Aah, I think I have a solution. In our generic file processor, we could start with a quick check to establish that the file isn't subject to special processing. We could skip the category files, since they have a TD:7. Aah, no. Transducer files also have that. I guess that in phase 1, when we process the category files under the regions, we could add a flag to the dictionary saying "no_generic_processing".
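A rough sketch of that flag idea, to make it concrete. Only the "no_generic_processing" key comes from the comment above; the dictionary name `page_flags`, both helper functions and the `Spain/Spain.html` path are hypothetical and not taken from the LegacyMan code.

```python
# Hypothetical sketch: page_flags, the helper names and the example
# path are assumptions, not names from the LegacyMan code base.

def mark_category_file(page_flags, file_path):
    """Phase 1: record that this file has its own category processing."""
    page_flags.setdefault(file_path, {})["no_generic_processing"] = True


def should_run_generic(page_flags, file_path):
    """Generic processor: skip any file flagged during phase 1."""
    return not page_flags.get(file_path, {}).get("no_generic_processing", False)


page_flags = {}
mark_category_file(page_flags, "Spain/Spain.html")

skip_spain = not should_run_generic(page_flags, "Spain/Spain.html")
run_other = should_run_generic(page_flags, "Other/Other.html")
```

With this, the generic processor only needs one lookup per file, and it doesn't depend on the TD:7 content at all.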

We could check for another phrase. These country/category files all have a TD:7 containing "PROPULSION CHARACTERISTICS OF..." (though our mock data doesn't currently have this). That could be an easy fix.

Aah, I've just thought of an even simpler solution. The generic file processor should not run if the file has already been processed. Then, if "custom processor" runs on that page first it won't get over-written, but if "generic processor" runs first it will get over-written. That should solve it :-)
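As a minimal sketch of that guard: I'm assuming here that the processed files are tracked in a plain set of file ids and that the category processor always writes its output. The function bodies and return values are placeholders for illustration only, not the real processors.

```python
# Hypothetical sketch: the set, function bodies and return values are
# assumptions for illustration, not the actual LegacyMan code.

def process_category_file(file_id, processed):
    """Category-specific processing: always writes, then records the file."""
    processed.add(file_id)
    return "category"


def process_generic_file(file_id, processed):
    """Generic processing: do nothing if the file was already handled."""
    if file_id in processed:
        return None
    processed.add(file_id)
    return "generic"


processed = set()
first = process_category_file("Spain/Spain.dita", processed)
second = process_generic_file("Spain/Spain.dita", processed)  # skipped
other = process_generic_file("Other/Other.dita", processed)   # runs as normal
```

Note the guard only works if both processors record exactly the same id for the same file.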

Update: I've just been investigating this. We are tracking files_already_processed. So, if we run the category processor first, its output should not get over-written by the generic processor. Aah, ok. For the category processor, the file id stored in files_already_processed includes the path, but for the generic processor it doesn't:

Category processor: target/dita/Spain/Spain.dita
process_generic_file: Spain.dita
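One way the two ids could be brought into line is to pass both through the same helper before they are stored or looked up. This is only a sketch: `processed_key` is a hypothetical name, and I'm assuming `target/dita` is the output root, based on the category processor id above.

```python
from pathlib import PurePosixPath

# Hypothetical helper: the name and the default output root are assumptions.
def processed_key(file_path, output_root="target/dita"):
    """Return the file id relative to the output root, where possible."""
    path = PurePosixPath(file_path)
    try:
        path = path.relative_to(PurePosixPath(output_root))
    except ValueError:
        pass  # not under the output root, so keep the path as given
    return path.as_posix()


category_key = processed_key("target/dita/Spain/Spain.dita")
relative_key = processed_key("Spain/Spain.dita")
bare_key = processed_key("Spain.dita")
```

Here `category_key` and `relative_key` both come out as `Spain/Spain.dita`, but `bare_key` stays as `Spain.dita`. So the generic processor would still need to know the folder it is writing to. The bare file name alone isn't enough to match.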

But ensuring we're storing the same path may not resolve this. It's still possible for a file to be processed by the generic file processor first, which (I think) would prevent it being processed by the category processor.