tldr-pages / tldr-translation-pairs-gen

Generates a structured dataset in various formats derived from tldr-pages.
https://opus.nlpl.eu/tldr-pages/corpus/version/tldr-pages
MIT License

Remove square brackets in command description #21

Open SethFalco opened 11 months ago

SethFalco commented 11 months ago

Some commands have square brackets in the command description as a hint to which argument the description refers to. However, this information isn't helpful when pairing translations, and it makes the English segment differ from its translations for no linguistic reason. It may be worth removing the brackets while building the corpus.

Example

pages/common/am.md

`am start -n {{com.android.settings/.Settings}}`

- Start an activity and pass [d]ata to it:

pages.de/common/am.md

- Starte eine Aktivität und übergib ihr Daten:

`am start -a {{android.intent.action.VIEW}} -d {{tel:123}}`

The resulting dataset then has:

<tu>
  <tuv xml:lang="de">
    <seg>Starte eine Aktivität und übergib ihr Daten </seg>
  </tuv>
  <tuv xml:lang="en">
    <seg>Start an activity and pass [d]ata to it </seg>
  </tuv>
</tu>
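
A minimal sketch of what the removal could look like, assuming the brackets in descriptions only ever mark mnemonic letters and never nest. The function name `strip_mnemonic_brackets` is hypothetical and not part of this project; it only illustrates the idea of dropping the brackets while keeping the letters inside them.

```python
import re

# Matches a bracketed run with no nested brackets, e.g. "[d]" in "[d]ata".
_MNEMONIC = re.compile(r"\[([^\[\]]+)\]")


def strip_mnemonic_brackets(description: str) -> str:
    """Return the description with each [x] marker replaced by its contents.

    Hypothetical helper: assumes square brackets in a description are
    mnemonic hints only, so literal brackets would also be unwrapped.
    """
    return _MNEMONIC.sub(r"\1", description)


print(strip_mnemonic_brackets("Start an activity and pass [d]ata to it"))
# prints "Start an activity and pass data to it"
```

Applied to descriptions before the segments are written, this would leave the German segment untouched and make the English segment read "Start an activity and pass data to it". If any description uses square brackets literally, a stricter pattern would be needed.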