Applying filters to published content

IanMayo commented 11 months ago

Our users wish to be able to export custom sub-sets of the content. They wish to share data/records with partner companies, but only for countries where they share maintenance contracts. They do not wish to include data for countries outside those maintenance contracts, since the competitor could bid for the maintenance on their own.

On occasion the joint contract may only be for some types of installation in the country, or even just some specific installations.

From experimenting with filtering earlier in the year, I learned that it is easier to mark specific content for exclusion rather than for inclusion.

I wish our users (in Oxygen) to be able to select links and tag those links with an audience parameter from a drop-down list of: Exclude A, Exclude B, Exclude C, and Exclude D. It should be reasonably easy to maintain the entries for the drop-down list within Oxygen.

The link selected above could be for a Country (from a region page), or an Installation from a category table page.

Then as part of the Publish phase the user would indicate if they wanted an exclusion block included. Oxygen would then only export content that had not been excluded for that company.

Hopefully with the above we will go with the flow for OxygenXML and use existing tools/processes, like those detailed here: https://www.oxygenxml.com/doc/versions/26.0/ug-editor/topics/dita-profiling-conditional-text.html

Note: this supersedes similar attempts to capture this issue in #43 and #6.

The outcome of this task will be guidance in how to provide/configure the audience values, and how to filter the output at the publishing stage.

frank-zimmermann commented 11 months ago

I created a new branch: https://github.com/DeepBlueCLtd/Fi3ldMan/tree/Applying-filters-to-published-content-%23105

There are many ways to define profiling options. @IanMayo , I went the opposite way to filter your desired data. Let me explain.

Instead of writing values to exclude something (Exclude A, Exclude B), I created 3 different *.ditaval files. (There was already one available; I removed it and moved the files into a folder named ditaval.)

The client can now select in the publishing process between

Company-A
Company-B
All

Each ditaval file describes which values of which attributes are allowed and which are not. Rather than maintaining a list of values that shouldn't be used, a very detailed list can be managed in this file.

All does not contain any rules, so by selecting this, everything will be created.
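As an illustration of the three profiles, here is a minimal Python sketch that generates DITAVAL content for them. The `prop` element with `att`, `val` and `action` attributes is standard DITAVAL; the `PROFILES` mapping and the `build_ditaval` helper are assumptions made for this example and are not taken from the branch.

```python
import xml.etree.ElementTree as ET

def build_ditaval(excluded_values, att="audience"):
    """Return DITAVAL XML that excludes each given value of one attribute."""
    val = ET.Element("val")
    for value in excluded_values:
        ET.SubElement(val, "prop", att=att, val=value, action="exclude")
    return ET.tostring(val, encoding="unicode")

# Hypothetical mapping: profile name -> audience values hidden from it.
PROFILES = {
    "Company-A": ["Company-B"],
    "Company-B": ["Company-A"],
    "All": [],  # no rules, so nothing is filtered out
}

ditavals = {name: build_ditaval(values) for name, values in PROFILES.items()}
for name, xml_text in ditavals.items():
    print(f"{name}.ditaval: {xml_text}")
```

The "All" profile comes out as an empty `val` element, which matches the "no rules" behaviour described above.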

The interesting part of Oxygen is the Attributes and Condition sets. I added the 3 ditaval files into our project file.

In the Profiling Attributes section, you define the values a user can use, so in our example "Company-A" and "Company-B".

To specify what content should be published, you have multiple possibilities in the Filters section of the DITA scenario.

Because the ditaval files were created and given names, these names are now available here. You can also refer to a separate ditaval file, or directly write all values that should be excluded. But I think it is better to work with separate ditaval files instead of changing the exclusion values every time.

During my research I found one more interesting option for more detailed filtering: a Subject Scheme Map https://www.oxygenxml.com/doc/versions/26.0/ug-editor/topics/subject-scheme-map.html

IanMayo commented 11 months ago

Hello @frank-zimmermann - thanks for this.

I ran some experiments with this earlier in the year, and learned it is easier to mark content to be excluded for a particular company, since the business rule to be applied is typically "Company-A can see all data except for France". The current strategy uses logic "Company-A can see all data except that tagged for Company-B". The real logic can't typically be implemented in that model since it would include so many special cases.

For a reviewer to verify the filtering has been implemented correctly, being able to view a list of "Show me all elements tagged with Exclude-Company-A" would meet their verification requirement. (I guess we'd use a bit of python/xsl to collate this list, when required)
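A rough sketch of what that bit of Python might look like, using only the standard library. The function name `find_tagged`, the `*.dita` file pattern and the sample topic are assumptions for illustration; real topics that use named entities such as `&nbsp;` would need a DTD-aware parser instead of `ElementTree`.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path

def find_tagged(root_dir, value, attr="audience"):
    """Return (file name, element tag, text) for every element whose
    attribute holds `value` as one of its space-separated tokens."""
    hits = []
    for path in sorted(Path(root_dir).rglob("*.dita")):
        for elem in ET.parse(path).iter():
            if value in elem.get(attr, "").split():
                text = " ".join("".join(elem.itertext()).split())
                hits.append((path.name, elem.tag, text))
    return hits

# Demo against a throwaway topic file (hypothetical content).
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "europe.dita").write_text(
        '<topic id="europe"><title>Europe</title><body><ul>'
        '<li><xref href="uk.dita">UK</xref></li>'
        '<li><xref href="france.dita" audience="Exclude-Company-A">France</xref></li>'
        '</ul></body></topic>',
        encoding="utf-8",
    )
    report = find_tagged(tmp, "Exclude-Company-A")

for name, tag, text in report:
    print(f"{name}: <{tag}> {text}")
```

Pointing `find_tagged` at the content folder instead of the temporary directory would give the reviewer one line per tagged element.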

Aah, one last thing. Is it possible to declare the values of audience in a data-file? Declaring them in Oxygen is a fragility/weakness I'd like to avoid if possible (since they may get lost on an Oxygen update).

frank-zimmermann commented 11 months ago

Hi @IanMayo , thanks for the feedback. This is the complete opposite of behavior as I have known it for years. From my understanding, I declare something as what it is and not as please ignore it for XYZ. Don't get me wrong, you and the client need to be clear on how to handle the content. I only want to let you know my thoughts. :-) I will find out if you can define the values separately in a file.

So in general, the functionality to include/exclude content is clear and no more time is needed here, correct? (without the last sentence)

IanMayo commented 11 months ago

complete opposite of behaviour

Yes, and it's the opposite of what I initially implemented. But, I learned that the include approach requires hundreds of pages of content to be tagged with multiple audience values. The color-coded audience values feature makes it difficult to view/review the filtering to be applied under this strategy - since the color coding would appear in so many places.

The exclude approach typically just requires changes to one or two pages - with the color-coded audience values only present in a handful of places. If there are 4 sister companies, each can have their own color code. The author can learn which color is which company exclusion, and as they work with the content they will be reviewing these exclude blocks when they see them.

Update: I see this page appears to show how we can put color-coded filter instructions in a DITAVAL file. We may learn that having the content in the DITAVAL brings problems, but initially I like the config data being stored out of the OxygenXML xpr file.
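For reference, DITAVAL supports `action="flag"` with a `backcolor` attribute on `prop`, so the color-coding can live in a file next to the content. The sketch below generates such a file in Python; the exclusion values and hex colors are placeholders chosen for the example, and how Oxygen's Author mode picks these up is not covered here.

```python
import xml.etree.ElementTree as ET

# Hypothetical color scheme: one background color per exclusion value.
FLAG_COLORS = {
    "Exclude-Company-A": "#ffd6d6",
    "Exclude-Company-B": "#d6e4ff",
}

def build_flagging_ditaval(colors, att="audience"):
    """Return DITAVAL XML that flags (rather than removes) each value."""
    val = ET.Element("val")
    for value, color in colors.items():
        ET.SubElement(val, "prop", att=att, val=value,
                      action="flag", backcolor=color)
    ET.indent(val)  # pretty-print; needs Python 3.9+
    return ET.tostring(val, encoding="unicode")

flag_xml = build_flagging_ditaval(FLAG_COLORS)
print(flag_xml)
```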

frank-zimmermann commented 11 months ago

Fully understandable. Thanks for the explanation.

frank-zimmermann commented 11 months ago

BTW: The audience values are saved in the xpr project file. Everything you change in the configuration and save as project-specific is stored there. Is this enough, or do you want to have these values somewhere else?

IanMayo commented 11 months ago

Thanks - where possible, I'd like us to keep content outside the .xpr please. Lots of changes happen to the .xpr and it can be difficult to resolve conflicts. Putting the content into separate files makes them more accessible to DITA-OT (for testing) and immune to .xpr conflict resolution issues.
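With the filter rules held in a standalone file, a test run outside Oxygen could call the DITA-OT `dita` command with its `--filter` option. The Python sketch below only builds and prints the command line; the DITAVAL path and the output folder are hypothetical, and `html5` is just one of the available output formats.

```python
def dita_command(ditamap, ditaval, fmt="html5", out_dir="out"):
    """Build the argument list for a filtered DITA-OT build."""
    return [
        "dita",
        f"--input={ditamap}",
        f"--format={fmt}",
        f"--filter={ditaval}",
        f"--output={out_dir}",
    ]

# Hypothetical file locations, for illustration only.
cmd = dita_command("index.ditamap", "ditaval/Company-A.ditaval")
print(" ".join(cmd))
```

On a machine where DITA-OT is on the PATH, the list can be handed to `subprocess.run(cmd, check=True)` from a test script.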

frank-zimmermann commented 11 months ago

We can define allowed values of attributes in a so-called Subject Scheme Map file. https://www.oxygenxml.com/doc/versions/25.1/ug-editor/topics/subject-scheme-map.html#subject-scheme-map

I will check what else is possible here.

IanMayo commented 11 months ago

That looks like the right thing, but note: my authors will only ever work in WYSIWYG (author) mode. They'll have a heart-attack if I suggest they work in XML.

frank-zimmermann commented 11 months ago

In Author mode, it is also the same. It was only a presentation for us :-) BTW: Only working in the structured view is correct ;-)

IanMayo commented 7 months ago

Note: since we're adopting an exclude model instead of an include model, I've just seen a simple feature in the Edit DITA Scenario panel:

If we tag content that France should not see with audience="-france", then the above UI allows us to quickly indicate we want -france excluded from the export.
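To make the effect of such an exclusion concrete, here is a small Python sketch of the filtering rule as I understand it from the DITA specification: an element is dropped only when every token in its audience attribute is excluded. The sample list and the helper names are invented for the example; the real filtering is done by the publishing engine, not by code like this.

```python
import xml.etree.ElementTree as ET

def is_excluded(elem, excluded, attr="audience"):
    """True when the attribute is set and all of its tokens are excluded."""
    tokens = elem.get(attr, "").split()
    return bool(tokens) and all(token in excluded for token in tokens)

def prune(parent, excluded):
    """Remove excluded descendants in place. Text following a removed
    element (its tail) goes with it, which is fine for this block-level demo."""
    for child in list(parent):
        if is_excluded(child, excluded):
            parent.remove(child)
        else:
            prune(child, excluded)

# Hypothetical region page fragment.
SAMPLE = """<ul>
  <li><xref href="uk.dita">UK</xref></li>
  <li audience="-france"><xref href="france.dita">France</xref></li>
  <li audience="-france -spain"><xref href="andorra.dita">Andorra</xref></li>
</ul>"""

root = ET.fromstring(SAMPLE)
prune(root, {"-france"})
remaining = [xref.text for xref in root.iter("xref")]
print(remaining)
```

Note that the item tagged with both -france and -spain survives when only -france is excluded, so content carrying several exclusion values may need checking during review.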

There will probably still be benefit in creating the Subject Scheme Map - particularly if it introduces the available terms in the Author mode. But, note: I've just configured this panel to remove all profiling attributes except for our exclude countries:

IanMayo commented 7 months ago

Note: the files shown below demonstrate how to specify export filters. They are more capable than the relatively dumb exclude filters provided directly in the dialog. But the dialog-based exclude filters are sufficient for the current capability.

IanMayo commented 5 days ago

Create subject scheme
Name collection key (countriesFilter)
Insert countries (-france, -spain)
Mark key as being used for audience
Remove platform, product, props, otherprops, rev
Open ditamap in the DITA Maps Manager
Keep scheme open, associate with index.ditamap (using append child, currently edited file), go to attributes, select type as subjectScheme.
Close Oxygen, re-open
Go into Configure scenario and select the Exclude all elements and provide -france.
Configure Oxygen to highlight audience exclusions (Prefs / Edit / Mode / Profiling-Conditional Text)
Configure content to exclude specific content
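
The first four steps above produce a subject scheme file. As a hedged sketch of what that file might contain, the Python below generates one with the countriesFilter key, the -france and -spain values, and an enumeration binding them to the audience attribute. The element names follow the DITA subject scheme vocabulary; the helper name and the exact layout Oxygen writes are assumptions.

```python
import xml.etree.ElementTree as ET

DOCTYPE = ('<!DOCTYPE subjectScheme PUBLIC '
           '"-//OASIS//DTD DITA Subject Scheme Map//EN" "subjectScheme.dtd">')

def build_subject_scheme(collection_key, values, attribute="audience"):
    """Return a subject scheme that limits `attribute` to `values`."""
    scheme = ET.Element("subjectScheme")
    # The collection key groups the permitted values.
    collection = ET.SubElement(scheme, "subjectdef", keys=collection_key)
    for value in values:
        ET.SubElement(collection, "subjectdef", keys=value)
    # Bind the collection to the attribute it controls.
    enum = ET.SubElement(scheme, "enumerationdef")
    ET.SubElement(enum, "attributedef", name=attribute)
    ET.SubElement(enum, "subjectdef", keyref=collection_key)
    ET.indent(scheme)  # pretty-print; needs Python 3.9+
    body = ET.tostring(scheme, encoding="unicode")
    return f'<?xml version="1.0" encoding="UTF-8"?>\n{DOCTYPE}\n{body}\n'

scheme_xml = build_subject_scheme("countriesFilter", ["-france", "-spain"])
print(scheme_xml)
```

The "associate with index.ditamap" step then adds a reference to this file from the map, with its type attribute set to subjectScheme.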
