What is the issue?
Some characters are encoded incorrectly in parts of the resulting PDF. For instance, umlauts and the en dash render correctly in some places but not in others. It looks like a mismatch between UTF-8 and ISO 8859-1.
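To illustrate the kind of mismatch suspected here, the following is a minimal Python sketch of what happens when UTF-8 bytes are decoded with a single-byte encoding. It is only a demonstration of the general effect, not a diagnosis of how threagile builds the PDF. The helper names `misdecode` and `repair` are made up for this example, and the choice of `cp1252` (the Windows superset of ISO 8859-1) is an assumption; with strict ISO 8859-1 the umlaut garbles the same way, while two of the three en dash bytes become invisible control characters.

```python
def misdecode(text: str, wrong_encoding: str = "cp1252") -> str:
    """Encode text as UTF-8, then decode the bytes with a single-byte encoding.

    This reproduces the typical garbling seen when UTF-8 input is treated
    as ISO 8859-1 / Windows-1252 somewhere along the way.
    """
    return text.encode("utf-8").decode(wrong_encoding)


def repair(garbled: str, wrong_encoding: str = "cp1252") -> str:
    """Reverse the mis-decoding, assuming no bytes were lost in between."""
    return garbled.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    # Values taken from the example model in this report.
    for sample in ["Johan Wölfe", "Some – Description"]:
        garbled = misdecode(sample)
        print(f"{sample!r} -> {garbled!r} -> {repair(garbled)!r}")
```

If the broken parts of the PDF show sequences like `Ã¶` in place of `ö`, or `â€“` in place of the en dash, that would be consistent with this kind of double handling.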
What does it look like?
See screenshots for reference.
↑ Here the en dash renders correctly.
↑ Here it does not.
↑ Here the umlaut renders correctly.
↑ Here it does not.
How can this be reproduced?
The following script was used to call the newest version of threagile using docker.
Here is a somewhat minimal example model based on the stub model, where the described behaviour can be seen.
threagile_version: 1.0.0
title: Model Stub
date: 2020-03-31
author:
  name: Johan Wölfe # Here is an Umlaut
  homepage: www.example.com
business_criticality: important
tags_available:
  - some-tag
data_assets:
  Some Data Asset:
    id: some-data
    description: Some Description
    usage: business
    tags:
    origin: Some Origin
    owner: Some Owner
    quantity: many
    confidentiality: confidential
    integrity: critical
    availability: operational
    justification_cia_rating: Some Justification
technical_assets:
  Some Technical – Asset: # Here is an En Dash
    id: some-component
    description: Some – Description # Here is an En Dash
    type: process
    usage: business
    used_as_client_by_human: false
    out_of_scope: false
    justification_out_of_scope:
    size: component
    technology: web-service-rest
    tags:
      - some-tag
    internet: false
    machine: virtual
    encryption: none
    owner: Some Owner
    confidentiality: confidential
    integrity: critical
    availability: critical
    justification_cia_rating: Some Justification
    multi_tenant: false
    redundant: false
    custom_developed_parts: true
    data_assets_processed:
      - some-data
    data_assets_stored:
    data_formats_accepted:
      - xml
    communication_links:
trust_boundaries:
  Some Trust Boundary:
    id: some-network
    description: Some Description
    type: network-dedicated-hoster
    tags:
    technical_assets_inside:
      - some-component
    trust_boundaries_nested:
shared_runtimes:
  Some Shared Runtime:
    id: some-runtime
    description: Some Description
    tags:
    technical_assets_running:
      - some-component
individual_risk_categories:
  Some Individual Risk Example:
    id: something-strange
    description: Some text describing the risk category...
    impact: Some text describing the impact...
    asvs: V0 - Something Strange
    cheat_sheet: https://example.com
    action: Some text describing the action...
    mitigation: Some text describing the mitigation...
    check: Check if XYZ...
    function: business-side
    stride: repudiation
    detection_logic: Some text describing the detection logic...
    risk_assessment: Some text describing the risk assessment...
    false_positives: Some text describing the most common types of false positives...
    model_failure_possible_reason: false
    cwe: 693
    risks_identified:
      <b>Example Individual Risk</b> at <b>Some Technical Asset</b>:
        severity: critical
        exploitation_likelihood: likely
        exploitation_impact: medium
        data_breach_probability: probable
        data_breach_technical_assets:
          - some-component
        most_relevant_data_asset:
        most_relevant_technical_asset: some-component
        most_relevant_communication_link:
        most_relevant_trust_boundary:
        most_relevant_shared_runtime: