AmitGorvadiya / tesseract-ocr

Automatically exported from code.google.com/p/tesseract-ocr

Simple PHP script to view box files #180

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi, great work so far.
I threw this together so I could easily view and edit box files. It might
not be as full-featured as tessboxes, but I would say it's a little easier
to set up and use if you've got a server to run it off of.
Please note that this script allows people to upload content to your
server, and thus shouldn't be made public. Instead, it's meant strictly for
internal use.

<?php
/* boxfilereader.php 
 * By Trevor Oldak
 * Takes an image that was used for training in tesseract, and
 * the resulting box file, and shows the location of each box
 * on the image.
 * 
 * Requires: 
 * PHP
 * Firefox (for the javascript to work right)
 * ImageMagick (for converting tif to jpg)
 * GD for image creation/manipulation
 */

//The location for the original image, converted to jpeg
$img_name = "image.jpg";
//The temp location of the tif image. Gets deleted when script executes
$tif_name = "image.tif";
//The location of the output box image
$box_png = "boxes.png";

if(isset($_POST['readboxfile'])){
    //Convert the image to jpg, then delete the tif
    move_uploaded_file($_FILES['image']['tmp_name'],$tif_name);
    exec("convert $tif_name $img_name");
    unlink($tif_name);

    $img_size = getimagesize($img_name);
    $img_width = $img_size[0];
    $img_height = $img_size[1];

    $img = imagecreatetruecolor($img_width,$img_height);
    $black = imagecolorallocate($img, 0, 0, 0);
    imagecolortransparent($img, $black);
    //Red, green, and blue so that boxes are more easily distinguished from one another
    $colors = array(imagecolorallocate($img, 255, 0, 0),
imagecolorallocate($img, 0, 255, 0), imagecolorallocate($img, 0, 0, 255));

    $boxes = file_get_contents($_FILES['box']['tmp_name']);
    $boxes = explode("\n", $boxes);
    foreach($boxes as $index => $row){

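        //Each valid row is "<char> <left> <bottom> <right> <top>"; trim strips stray \r from Windows line endings
        $row = explode(" ", trim($row));
        if(count($row) == 5){
            //Box files measure y from the bottom of the image, GD from the top, hence the flip
            imagerectangle($img, (int)$row[1], $img_height-(int)$row[2], (int)$row[3], $img_height-(int)$row[4], $colors[$index%3]);
        }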
    }

    //Save the image as a png
    imagepng($img, $box_png);
    imagedestroy($img);
?>
<html>
    <head>
    <style>
    body{
        padding:0px;
        margin:0px;
    }

    #frame{
            width:<?php echo $img_width ?>px;
            height:<?php echo $img_height ?>px;
            background-image: url(<?php echo $img_name ?>);
    }
    </style>
    <script>

        //Prints the coordinates (from the lower left) of the click.
        //Lazy hack, only works from the corner of the screen, not
        //where the image was actually clicked.
        function getCoords(e){
            document.getElementById('foo').value = e.pageX;
            document.getElementById('bar').value = <?php echo $img_height ?>-e.pageY;
        }
    </script>
    </head>
    <body>
        <div id="frame">
            <img src="<?php echo $box_png ?>" onClick="javascript:getCoords(event)"/>
        </div>
        <br/>
        x: <input type="text" id="foo"/><br/>
        y: <input type="text" id="bar"/>
    </body>
</html>
<?php 
}else{
?>
<html>
    <body>
        <form method="POST" enctype="multipart/form-data" action="#">
            Image file: <input type="file" name="image"/><br/>
            Box file: <input type="file" name="box"/><br/>
            <input type="hidden" name="readboxfile" value="true"/>
            <input type="submit" value="Go!"/>
        </form>
    </body>
</html>
<?php
}
?>
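
The coordinate handling in the script above can be sketched on its own. The following is a minimal JavaScript illustration, not part of the original script: the helper names `parseBoxLine` and `toImageRect` are hypothetical, and it assumes the five-field box format the script expects (`<char> <left> <bottom> <right> <top>`, with y measured from the bottom edge of the image).

```javascript
// Hypothetical helpers mirroring the explode()/imagerectangle() loop above.

// Parse one box-file row into numbers; returns null for rows that do not
// have exactly five fields or whose coordinates are not integers.
function parseBoxLine(line) {
  const parts = line.trim().split(/\s+/);
  if (parts.length !== 5) return null;
  const [char, ...coords] = parts;
  const [left, bottom, right, top] = coords.map(Number);
  if (![left, bottom, right, top].every(Number.isInteger)) return null;
  return { char, left, bottom, right, top };
}

// Convert a box (origin at lower left) into the two corner points that a
// top-left-origin drawing API would need, given the image height in pixels.
function toImageRect(box, imgHeight) {
  return {
    x1: box.left,
    y1: imgHeight - box.bottom,
    x2: box.right,
    y2: imgHeight - box.top,
  };
}

const box = parseBoxLine("a 10 20 30 40");
console.log(toImageRect(box, 100)); // { x1: 10, y1: 80, x2: 30, y2: 60 }
```

The only transformation is the y flip (`imgHeight - y`); x values pass through unchanged.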

Original issue reported on code.google.com by trev...@gmail.com on 8 Jan 2009 at 4:51

GoogleCodeExporter commented 9 years ago
Thanks for the contribution.
If you want, and you add an Apache license header, it could go in the downloads section.

Original comment by theraysm...@gmail.com on 10 Mar 2009 at 8:51

GoogleCodeExporter commented 9 years ago
Ok, license added. I attached the file to this post. I also made a few changes since my first post:
1. Instead of clicking for coordinates, you can now drag a bounding box around a character and it will alert you with the coordinates you dragged, which you can paste back into your box file.
2. You can specify min/max width/height to quickly filter out invalid garbage boxes. The filter also removes tildes, which are for 0% confidence characters. The cleaned output is printed to a textarea.
3. Cursor replaced with a crosshair and other minor visual improvements.
4. Uploading the training image is now optional, to save on refresh time.
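
For readers without the attachment, items 1 and 2 can be sketched roughly as follows. This is an illustration only, not the attached code: the function names `dragToBox` and `filterBoxLines` and the `limits` object are hypothetical, drag points are assumed to be image pixels measured from the top-left corner, and "removes tildes" is read as dropping rows whose character is `~`.

```javascript
// Hypothetical sketch of the drag and filter features described above.

// Turn the two corners of a mouse drag (top-left origin) into the
// "<left> <bottom> <right> <top>" part of a box-file row (bottom-left origin).
function dragToBox(start, end, imgHeight) {
  const left = Math.min(start.x, end.x);
  const right = Math.max(start.x, end.x);
  const bottom = imgHeight - Math.max(start.y, end.y);
  const top = imgHeight - Math.min(start.y, end.y);
  return `${left} ${bottom} ${right} ${top}`;
}

// Keep only well-formed rows whose width and height fall inside the limits,
// and drop rows whose character is a tilde.
function filterBoxLines(lines, limits) {
  return lines.filter((line) => {
    const parts = line.trim().split(/\s+/);
    if (parts.length !== 5) return false;
    const [char, left, bottom, right, top] = parts;
    if (char === "~") return false;
    const width = Number(right) - Number(left);
    const height = Number(top) - Number(bottom);
    // Non-numeric fields give NaN, which fails every comparison and is dropped.
    return (
      width >= limits.minWidth && width <= limits.maxWidth &&
      height >= limits.minHeight && height <= limits.maxHeight
    );
  });
}

const limits = { minWidth: 5, maxWidth: 100, minHeight: 5, maxHeight: 100 };
const rows = ["a 10 20 30 40", "~ 0 0 5 5", "b 0 0 500 500", "junk"];
console.log(filterBoxLines(rows, limits)); // [ 'a 10 20 30 40' ]
console.log(dragToBox({ x: 30, y: 60 }, { x: 10, y: 80 }, 100)); // 10 20 30 40
```

Taking the min and max of the two drag points means the result is the same whichever corner the drag starts from.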

Please also note that I wrote this code for a personal project and customized it to fit my own needs, so it may take some tweaking to suit yours.

Original comment by trev...@gmail.com on 10 Mar 2009 at 11:09


GoogleCodeExporter commented 9 years ago
Thanks. Uploaded.
Ray.

Original comment by theraysm...@gmail.com on 12 Mar 2009 at 10:44

GoogleCodeExporter commented 9 years ago
What should I do if ImageMagick is not installed on my PC?

Original comment by concepti...@gmail.com on 24 Apr 2010 at 4:46