raffaeldantas / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Tesseract 3.03 version numbers are quite hard to follow #1424

Closed GoogleCodeExporter closed 8 years ago

GoogleCodeExporter commented 8 years ago
The tesseract web site indicates that the development of tesseract
is at 3.03RC1.

The latest tag for tesseract on the source VCS is also at 3.03RC1.

The latest release notes of tesseract list 3.03RC1 (4 Feb 2014) as the 
latest release.

At the same time, Debian and Ubuntu systems install 3.03.03.

One might say that this is an issue for the Debian packagers. But the Debian 
changelog says that they updated the release numbers following "new upstream 
releases".

And indeed, the commit logs of the VCS tree contain mentions of 3.03.02.

To complicate the matter further, Fedora seems to have a 3.03-0.4RC1.

Unfortunately, all this makes it extremely hard to find the upstream source 
that actually corresponds to a distribution release and check in the revision 
history if some bug has been fixed, is still open, is worth reporting, etc.
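
To make the mismatch concrete, here is a minimal sketch that splits the version strings mentioned above into a common "major.minor" prefix and whatever was appended to it. The pattern and the function name `split_version` are my own assumptions for illustration; this is not how any distribution or the tesseract project actually parses versions.

```python
import re

# Assumed layout: a "major.minor" prefix, then whatever the tagger or
# packager appended (an extra ".NN", "-0.4", "RC1", ...).
_VERSION_RE = re.compile(r"^(?P<upstream>\d+\.\d+)(?P<rest>.*)$")
_RC_RE = re.compile(r"RC(\d+)", re.IGNORECASE)


def split_version(version):
    """Split a version string into (upstream, suffix, rc_number_or_None)."""
    match = _VERSION_RE.match(version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    rest = match.group("rest")
    rc = _RC_RE.search(rest)
    return match.group("upstream"), rest, int(rc.group(1)) if rc else None


if __name__ == "__main__":
    # The strings quoted in this report.
    for v in ["3.03RC1", "3.03.03", "3.03.02", "3.03-0.4RC1"]:
        print(v, "->", split_version(v))
```

All four strings share the prefix 3.03, but the suffixes follow different conventions, so the suffix alone does not say which upstream commit a package was built from.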

The fact that distributions package 3.03 with the language files of 3.02 makes 
it even more challenging.

Just a few questions:

- Has 3.03 become stable? The 3.03.03 release number with no RC postfix in the 
Debian/Ubuntu versions suggests so, but the 3.03-04RC1 of Fedora suggests the 
contrary.
- Is the Fedora 3.03-03RC1 release based on the same upstream as Debian 
3.03.03? In other words, is there an "official" 3.03.03 upstream? At what git 
commit id?
- What is the commit id of 3.03-04?
- If 3.03-04 is an RC, is it considered stable enough that it was 'announced' 
as a release? Where? Google reports no answers to "tesseract 3.03-04"...
- Should it be 3.03-04 or 3.03.04?
- Is the bug regarding 'disappearing lines' when producing pdf output fixed in 
3.03-04?

Original issue reported on code.google.com by sergio.c...@gmail.com on 18 Feb 2015 at 10:33

GoogleCodeExporter commented 8 years ago
I forgot to mention that 3.03.03 from Debian/Ubuntu reports itself as plain 
3.03, with no further detail, when called with "--version".
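
For anyone who wants to check this on their own system, a rough sketch follows. It assumes the binary prints a first line of the form "tesseract <version>"; that format, and the helper names `parse_banner` and `installed_version`, are assumptions of mine. Both stdout and stderr are read because I am not assuming which stream carries the banner.

```python
import re
import subprocess


def parse_banner(text):
    """Return the version token from a 'tesseract <version>' line, or None."""
    match = re.search(r"^tesseract\s+(\S+)", text, re.MULTILINE)
    return match.group(1) if match else None


def installed_version(binary="tesseract"):
    """Run '<binary> --version' and extract the version it reports."""
    result = subprocess.run(
        [binary, "--version"], capture_output=True, text=True
    )
    # Search both streams rather than assume where the banner is printed.
    return parse_banner(result.stdout + "\n" + result.stderr)


if __name__ == "__main__":
    print(installed_version())
```

Comparing this value with the version reported by the package manager shows whether the binary carries the distribution's extra version component.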

Original comment by sergio.c...@gmail.com on 18 Feb 2015 at 10:36

GoogleCodeExporter commented 8 years ago
This is not a tesseract-ocr issue. Please complain to the packagers of the 
relevant distribution.
Version 3.03 was not and will not be released. Search the tesseract-dev forum 
for more details.

Original comment by zde...@gmail.com on 19 Feb 2015 at 9:48

GoogleCodeExporter commented 8 years ago
This is what I initially thought.

But I really cannot understand:

1) How is it possible that so many distros get this wrong?
  - Debian 3.03.03
  - Ubuntu 3.03.03
  - Fedora rawhide 3.03-0.4.RC1
  - Fedora 21 3.03-0.2.RC1
Was there a communication problem?

2) Most important... how is it that these "unofficial" release numbers make it 
into the official project repository in commit comments? For instance:
  - fix PDF rendering for Arabic. http://ftp.de.debian.org/debian/pool/main/t/tesseract/tesseract_3.03.02-3.diff.gz

In any case, I'll open a bug in Ubuntu.

Original comment by sergio.c...@gmail.com on 20 Feb 2015 at 9:57