Open LAfricain opened 5 years ago
I will look into trying to fix this over the next few days.
That warning is added any time orefs is not able to process a reference. That's to make it easier to locate where references were not processed so that manual fixes can be applied afterwards.
That warning is added any time orefs is not able to process a reference.
Yes, I know, but in this case it should not be there, because orefs was able to process the reference.
It put the warning there because right now it thinks "Ps." is a separate reference. It sees 3 references. It's entirely expected.
Ok I understand.
This should be fixed now. Try it and let me know.
No, it's not working:
<verse sID="Zech.12.10" osisID="Zech.12.10" n="10"/>Men över Davids hus och över Jerusalems invånare skall jag utgjuta en nådens och bönens ande, så att de se upp till mig, och se vem de hava stungit. Och de skola hålla dödsklagan efter honom, såsom man håller dödsklagan efter ende sonen, och skola bittert sörja honom, såsom man sörjer sin förstfödde.<note type="crossReference"><reference osisRef="Zech.6.26 Zech.39.29 Joel.2.28 Rev.8.10 Rev.19.37">Jer. 6,26. Hes. 39,29. Joel 2,28. Am. 8,10. Joh. 19,37.<!-- orefs - unprocessed reference --></reference>
It continues to read the "." only as a separator between refs (only books without a "." are built correctly; see Joel above):
WARNING: Reference not processed… Jer
WARNING: Reference not processed… 52,8 f
WARNING: Reference not processed… Lam
WARNING: Reference not processed… Ps
WARNING: Reference not processed… Jer
WARNING: Reference not processed… 25,15, 21
WARNING: Reference not processed… Lam
WARNING: Reference not processed… Jes
WARNING: Reference not processed… Lam
WARNING: Reference not processed… 5 Mos
WARNING: Reference not processed… 28,30 f
WARNING: Reference not processed… Lam
WARNING: Reference not processed… 2 Mos
If you want to test, the files are at https://gitlab.com/crosswire-bible-society/swe1917/tree/master/osis but please ignore the differences between some refs: the refs in the deuterocanonical books are written like Mark. 3:17., and in the other books, as I said already, like Mark. 3,17.
I will look at the source so I can investigate this further. Note that inconsistency in the characters used to separate parts of the references is not something that orefs can handle.
Looking at the source, I can see why it's not working. There are far too many inconsistencies in how the references are written for orefs to reliably process them.
Note that inconsistency in the characters used to separate parts of the references is not something that orefs can handle.
Yes, I plan to standardize that. But it should at least handle the refs that are well written, and that does not work either. Is this linked to issue #68?
I tested with just one file (so all the refs use the same form) and it is the same problem. orefs doesn't recognize refs that end with a ".".
orefs thought there was an additional reference since the . was used to separate multiple references. I just added an adjustment to ignore empty references so that shouldn't be marked any longer.
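To illustrate the behaviour discussed above, here is a minimal sketch (not the actual orefs code) of splitting a reference string on the reference separator and dropping empty fragments. The function name `split_references` and its `ref_sep` parameter are hypothetical. It also shows why a period after the book abbreviation still breaks the book name away from its chapter and verse, which matches the "Jer" / "52,8 f" pairs in the warnings:

```python
def split_references(text, ref_sep="."):
    """Split a reference string on ref_sep and drop empty fragments.

    Hypothetical helper for illustration only; not part of orefs.
    """
    parts = [part.strip() for part in text.split(ref_sep)]
    # A trailing separator yields an empty last fragment; ignore it.
    return [part for part in parts if part]


# Book name without a period: splits cleanly into two references.
print(split_references("Joel 2,28. Joel 3,1."))

# Book abbreviation ending in a period: the book is cut off from its numbers.
print(split_references("Jer. 52,8 f."))
```

Ignoring the empty fragment removes the spurious warning for the trailing period, but it does not help when the abbreviation itself contains the separator.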
@adyeths Are abbreviations of compound book names that contain more than one period catered for?
Example (Latvian Bibles): Dāv.dz. for Dāvida dziesmu grāmata (Ps.)
@DavidHaslam The only reason the period was a problem here is that it was also used to separate multiple references. That has been corrected. There shouldn't be any issues with abbreviations that contain more than one period.
I'm sorry, I posted on the wrong issue by mistake; this message was meant for this issue:
Even now, the problem persists:
<reference osisRef="1Kgs.29.45 1Kgs.26.11">2 Mos. 29,45. 3 Mos. 26,11.<!-- orefs - unprocessed reference --></reference>
If it's not possible to handle this, should I change all the refs?
There are just too many inconsistencies in how the references are written in the swe1917 text for orefs to reliably process them. And the fragments I'm seeing posted here just aren't enough for me to figure out if I can even address the problem in orefs. I will have to wait until the inconsistencies are corrected before I can proceed further with this. (And if they can't be corrected, then the references will have to be processed manually.)
There are just too many inconsistencies in how the references are written in the swe1917
The only inconsistency I see is the chapter and verse separator. I can fix this. Do you see other inconsistencies? You can already have a look at the OSIS file. But we are almost sure the problem is the period that ends the book name. And it's a very common habit among translators. I have already seen it in three modules.
It is also seen in the Latvian Glück module that I have been working on somewhat last week to help Jānis V.
All the book abbreviations end with a period.
But the period is also used for other purposes.
Comma is used between chapter and verse.
Aside: Most confusing to read for an Englishman.
IMHO. There ought to be a rule for translators that if a book “abbreviation” is not really an abbreviation then there should be no period.
So none after Job, Amos, Joel (e.g.). But this rule is often ignored.
Another quirk is whether or not there’s a space after the period at the end of the book abbreviation.
Messy it can be....
I should have added
I typed a 5 but CodeHub or GutHub changed it to a 1.
IMHO. There ought to be a rule for translators that if a book “abbreviation” is not really an abbreviation then there should be no period.
Yes! This is the difficulty of standardization...
Unless all of the characters used for separators in the references are different, orefs will not be able to process the references. There is no way around this requirement for orefs. orefs will never be able to handle all possible ways a reference can be written. It's just not possible.
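A small sketch of why a shared separator is ambiguous, under the assumption (mine, not stated in this thread) that a reference may be either a bare chapter or a chapter plus verse. When the same character separates chapter from verse and one reference from the next, a run of numbers can be grouped in several ways, and nothing in the text says which grouping is intended. The function name `possible_groupings` is hypothetical:

```python
def possible_groupings(numbers):
    """Enumerate every way to group numbers into references of 1 or 2 parts.

    Illustrative only: assumes a reference is 'chapter' or 'chapter, verse'.
    """
    if not numbers:
        return [[]]
    results = []
    for size in (1, 2):
        if len(numbers) >= size:
            head = tuple(numbers[:size])
            for rest in possible_groupings(numbers[size:]):
                results.append([head] + rest)
    return results


# One period-separated run of numbers, as in a text that uses "." for
# both the chapter/verse separator and the reference separator.
for grouping in possible_groupings("3.16.4.2".split(".")):
    print(grouping)
```

With distinct separators, for example "3,16. 4,2", only one grouping is possible, which is the consistency that orefs relies on.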
I have changed orefs so it doesn't break multiple references apart before processing book abbreviations. This means it will process the references with book abbreviations that include a period even when a period is used elsewhere in the reference.
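The ordering described here can be sketched as follows. This is only an illustration of the idea, not the orefs implementation: the abbreviation table is a tiny hypothetical sample, and `to_osis_refs`, `ref_sep`, and `cv_sep` are made-up names. Book abbreviations are resolved first, so their periods are gone before the string is split on the reference separator:

```python
import re

# Hypothetical sample table (Swedish abbreviation -> OSIS book id).
BOOK_ABBREVIATIONS = {
    "1 Mos.": "Gen",
    "2 Mos.": "Exod",
    "3 Mos.": "Lev",
    "Ps.": "Ps",
    "Joel": "Joel",
}


def to_osis_refs(text, ref_sep=".", cv_sep=","):
    """Resolve book abbreviations first, then split into references."""
    # Longest names first so a longer abbreviation wins over a shorter one.
    names = sorted(BOOK_ABBREVIATIONS, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    # Step 1: replace each abbreviation with its OSIS id (no period in it).
    normalized = pattern.sub(lambda m: BOOK_ABBREVIATIONS[m.group(0)], text)
    # Step 2: only now split on the reference separator.
    refs = []
    for piece in normalized.split(ref_sep):
        piece = piece.strip()
        if not piece:
            continue  # ignore the empty fragment after a trailing separator
        book, _, chapter_verse = piece.partition(" ")
        chapter, _, verse = chapter_verse.partition(cv_sep)
        refs.append(f"{book}.{chapter.strip()}.{verse.strip()}")
    return " ".join(refs)


print(to_osis_refs("2 Mos. 29,45. 3 Mos. 26,11."))
```

The sketch assumes every reference names its book and has both a chapter and a verse; ranges, verse lists, and references without a book would need extra handling.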
I have looked at the osis file for the swe1917 module. There is no consistency with the references in that file. Some are written one way, others are written another. It's extremely messy. orefs expects consistency in how the references are written. Without that consistency it will not be able to process the references.
I have changed orefs so it doesn't break multiple references apart before processing book abbreviations. This means it will process the references with book abbreviations that include a period even when a period is used elsewhere in the reference.
Currently this doesn't work:
osisRef="1Kgs.29.45 1Kgs.26.11">2 Mos. 29,45. 3 Mos. 26,11.<!-- orefs - unprocessed reference --></reference></note> <verse eID="1Kgs.6.13"/></p>
Is it possible for orefs.py to "understand" that the same character can be used for two different separators? For example, Ndebele uses a period both to separate chapter from verse and to separate references. To be exact, the period is used when the ref points to another book, but if the ref points to the same book, Ndebele uses a semicolon. Another question: would it be possible to add an option in orefs for the character that designates the word "following"? In French it is s., or ss. if more than one verse follows; in Swedish, f. and ff. Or is the only way to fix it with a USFM tag?
Is it possible for orefs.py to "understand" that the same character can be used for two different separators?
No, it's not possible with orefs. All separator characters have to be different.
Another question: would it be possible to add an option in orefs for the character that designates the word "following"?
I will look into this. It's not something that I will be able to do quickly, though.
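As a rough idea of what such an option might involve (purely a sketch; orefs has no such option in this thread, and `split_following_marker` and its `markers` parameter are hypothetical), the "following" marker could be detected and separated from the verse before the rest of the reference is parsed. What the marker should then become in the osisRef is left open here:

```python
def split_following_marker(ref, markers=("f", "ff")):
    """Separate a trailing 'following' marker from a reference fragment.

    Returns (reference_without_marker, marker_or_None).
    Hypothetical helper for illustration; not part of orefs.
    """
    text = ref.strip().rstrip(".").rstrip()
    # Try longer markers first so "ff" is not mistaken for "f".
    for marker in sorted(markers, key=len, reverse=True):
        if text.endswith(marker):
            head = text[: -len(marker)].rstrip()
            # Only accept the marker when it directly follows a number.
            if head[-1:].isdigit():
                return head, marker
    return text, None


print(split_following_marker("52,8 f"))                        # Swedish
print(split_following_marker("3,17 ss.", markers=("s", "ss")))  # French
print(split_following_marker("17,1"))                           # no marker
```

Making the markers a parameter keeps the language-specific part (f./ff., s./ss.) out of the parsing logic.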
How is it that humans can still make sense of references where the same punctuation mark is used for different purposes?
I have a very special type of ref in swe1917. All book names end with a ".", e.g.:
But the character that separates multiple references is also a ".". Look:
1 Mos. 17,1. 26,24. 35,11.
As a result, orefs.py doesn't convert the ref using the correct book, but using the current book. E.g.: <reference osisRef="Exod.78.44 Exod.105.29">Ps. 78,44. 105,29.<!-- orefs - unprocessed reference -->
The current book is Exodus, but the target of the ref is Psalms; orefs converts the ref as Exodus. It also adds this warning: <!-- orefs - unprocessed reference -->
, I think because the ref lines end with a ".".
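The expected behaviour for a list like "Ps. 78,44. 105,29." can be sketched as carrying the last explicitly named book forward, and falling back to the current book only when no book has been named. This is an illustrative sketch under my own assumptions, not how orefs is implemented; the abbreviation table, the `|` placeholder, and the name `resolve_books` are all hypothetical:

```python
import re

# Hypothetical sample table (Swedish abbreviation -> OSIS book id).
BOOK_ABBREVIATIONS = {"1 Mos.": "Gen", "Ps.": "Ps"}


def resolve_books(text, current_book, ref_sep=".", cv_sep=","):
    """Build osisRef values, carrying the last named book forward."""
    names = sorted(BOOK_ABBREVIATIONS, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(name) for name in names))
    # Replace each abbreviation with "<OSIS id>|" so its period disappears
    # and the book can still be told apart from the chapter number.
    normalized = pattern.sub(
        lambda m: BOOK_ABBREVIATIONS[m.group(0)] + "|", text
    )
    book = current_book  # used until a book is named explicitly
    refs = []
    for piece in normalized.split(ref_sep):
        piece = piece.strip()
        if not piece:
            continue
        if "|" in piece:
            book, _, piece = piece.partition("|")
        chapter, _, verse = piece.strip().partition(cv_sep)
        refs.append(f"{book}.{chapter.strip()}.{verse.strip()}")
    return " ".join(refs)


# In Exodus, a cross-reference to two places in Psalms:
print(resolve_books("Ps. 78,44. 105,29.", current_book="Exod"))
```

For the example from this comment, the sketch yields Ps for both references instead of Exod, because the second reference inherits the book named in the first.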