Closed by martinratinaud 2 years ago
This bears some similarities to https://github.com/ambanum/OpenTermsArchive/issues/752.
The first step should be to check for updates or pending issues in @accordproject, and to open an issue there if none match.
(As a side note on the label question: I don't really see the benefit of adding a specific label for this at the moment. The number of issues is manageable as it is, and grouping them by technical component will unfortunately not make them easier to solve IMO 😅)
Here is what I've done so far.
TL;DR: the problem occurs when converting from PDF to HTML (not from HTML to Markdown).
We also have to consider the following
My suggestion is that we give ourselves 2 days to see whether the issues I created get an answer; if not, we try another PDF-to-HTML converter and see if it works.
@MattiSG @Ndpnt if you have any other ideas, please shoot
Thanks @martinratinaud for your investigation, recap and opening of issues to dependents! Let's hope @accordproject has a quick fix for this 🙂 🤞
A fix has been made and it is working correctly 🎉
I still need to update the tests, though, as the output will now be slightly different for all PDF files
While looking at changes received by email, I came across this particular PDF file present in `france` (see https://github.com/OpenTermsArchive/france-declarations/blob/main/declarations/Decathlon.json): https://www.decathlon.fr/static/2019/LP/services/global-services/V21/assets/cgv.pdf
It generates a version with whitespace missing in some places: https://github.com/OpenTermsArchive/france-versions/commit/91b6c1f
@MattiSG not sure what kind of label I should use though, maybe `parsing`?