JPG import gets stuck - Githubissues

bhive01 commented 7 years ago

I have outlines of fruit that I'm trying to import, but for some reason it keeps getting stuck seemingly randomly during the import process.

> momoc_list <- list.files("~/Google Drive/Syngenta/Cucurbitaceae/Squash/2015-06 MAAG Squash DVD/Images/2016-10-09_15-12_Results/Tue_11_Oct_2016_1222_20/midline/withoutPeduncle", pattern = "\\.(jpg|JPG)$", full.names = TRUE)
> squash_shapes <- import_jpg(momoc_list, threshold = 0.5)
Extracting 334.jpg outlines...
[ 1 / 334 ]  IMG_2704.JPGcolcor.jpg_Fruit_1midline.jpg

Sometimes it stops after just a few, other times after more than 100, but after 45 minutes of trying it has not completed the import fully.

When it gets stuck, one core is stuck at 100% usage and no further updates happen. I can cancel with an 'esc' so it doesn't lock R completely.

How can I pin down what it is doing to help get a fix? CC: @DanChitwood

> devtools::session_info()
Session info ------------------------------------------------------------------------------------------------------------------------------------------------------------
 setting  value                                      
 version  R version 3.3.1 Patched (2016-10-14 r71527)
 system   x86_64, darwin13.4.0                       
 ui       AQUA                                       
 language (EN)                                       
 collate  en_US.UTF-8                                
 tz       America/Los_Angeles                        
 date     2016-10-17                                 

Packages ----------------------------------------------------------------------------------------------------------------------------------------------------------------
 package     * version    date       source                           
 ape           3.5        2016-05-24 CRAN (R 3.3.0)                   
 assertthat    0.1        2013-12-06 CRAN (R 3.3.0)                   
 broom       * 0.4.1      2016-06-24 CRAN (R 3.3.0)                   
 coda          0.18-1     2015-10-16 CRAN (R 3.3.0)                   
 colorspace    1.2-7      2016-10-11 CRAN (R 3.3.0)                   
 DBI           0.5-1      2016-09-10 CRAN (R 3.3.0)                   
 deSolve       1.14       2016-09-05 CRAN (R 3.3.0)                   
 devtools      1.12.0     2016-06-24 CRAN (R 3.3.0)                   
 digest        0.6.10     2016-08-02 CRAN (R 3.3.0)                   
 dplyr       * 0.5.0      2016-06-24 CRAN (R 3.3.0)                   
 geiger        2.0.6      2015-09-07 CRAN (R 3.3.0)                   
 geometry      0.3-6      2015-09-09 CRAN (R 3.3.0)                   
 geomorph      3.0.3      2016-09-09 CRAN (R 3.3.0)                   
 ggplot2     * 2.1.0      2016-05-28 Github (hadley/ggplot2@b181e9a)  
 gtable        0.2.0      2016-02-26 CRAN (R 3.3.0)                   
 htmltools     0.3.5      2016-03-21 CRAN (R 3.3.0)                   
 htmlwidgets   0.7        2016-08-02 CRAN (R 3.3.0)                   
 httpuv        1.3.3      2015-08-04 CRAN (R 3.3.0)                   
 jpeg          0.1-8      2014-01-23 CRAN (R 3.3.0)                   
 jsonlite      1.1        2016-09-14 CRAN (R 3.3.0)                   
 knitr         1.14       2016-08-13 CRAN (R 3.3.0)                   
 lattice       0.20-34    2016-09-06 CRAN (R 3.3.1)                   
 lazyeval      0.2.0      2016-06-12 CRAN (R 3.3.0)                   
 lubridate   * 1.5.6.9000 2016-05-16 Github (hadley/lubridate@5b8c8fe)
 magic         1.5-6      2013-11-20 CRAN (R 3.3.0)                   
 magrittr    * 1.5        2014-11-22 CRAN (R 3.3.0)                   
 MASS          7.3-45     2016-04-21 CRAN (R 3.3.1)                   
 Matrix        1.2-7.1    2016-09-01 CRAN (R 3.3.1)                   
 memoise       1.0.0      2016-01-29 CRAN (R 3.3.0)                   
 mime          0.5        2016-07-07 CRAN (R 3.3.0)                   
 mnormt        1.5-4      2016-03-09 CRAN (R 3.3.0)                   
 Momocs      * 1.0.13     2016-10-17 Github (vbonhomme/Momocs@7f541c3)
 munsell       0.4.3      2016-02-13 CRAN (R 3.3.0)                   
 mvtnorm       1.0-5      2016-02-02 CRAN (R 3.3.0)                   
 nlme          3.1-128    2016-05-10 CRAN (R 3.3.1)                   
 plyr          1.8.4      2016-06-08 CRAN (R 3.3.0)                   
 psych         1.6.6      2016-06-28 CRAN (R 3.3.0)                   
 purrr       * 0.2.2      2016-06-18 CRAN (R 3.3.0)                   
 R6            2.2.0      2016-10-05 CRAN (R 3.3.0)                   
 Rcpp          0.12.7     2016-09-05 CRAN (R 3.3.0)                   
 readr       * 1.0.0      2016-08-03 CRAN (R 3.3.0)                   
 reshape2      1.4.1      2014-12-06 CRAN (R 3.3.0)                   
 rgl           0.96.0     2016-08-25 CRAN (R 3.3.0)                   
 scales        0.4.0      2016-02-26 CRAN (R 3.3.0)                   
 shiny         0.14.1     2016-10-05 CRAN (R 3.3.0)                   
 sp            1.2-3      2016-04-14 CRAN (R 3.3.0)                   
 stringi       1.1.2      2016-10-01 CRAN (R 3.3.0)                   
 stringr       1.1.0      2016-08-19 CRAN (R 3.3.0)                   
 subplex       1.1-6      2015-07-11 CRAN (R 3.3.0)                   
 tibble      * 1.2        2016-08-26 CRAN (R 3.3.0)                   
 tidyr       * 0.6.0.9000 2016-08-26 Github (cpsievert/tidyr@7822f7a) 
 tidyverse   * 0.0.0.9000 2016-09-06 Github (hadley/tidyverse@7706c5e)
 withr         1.0.2      2016-06-20 CRAN (R 3.3.0)                   
 xtable        1.8-2      2016-02-05 CRAN (R 3.3.0)

bhive01 commented 7 years ago

A few examples of the images. They are RGB and are not pure masks, because each one has a yellow outline and a red midline drawn on it.

[image: img_2704 jpgcolcor jpg_fruit_1midline]

[image: img_2786 jpgcolcor jpg_fruit_1midline]

[image: img_2859 jpgcolcor jpg_fruit_1midline]
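
One way to check up front whether an image is a pure mask or still carries coloured overlay lines is to measure how far apart the colour channels are at each pixel. The sketch below is only an illustration: the function name `colored_fraction` and the tolerance of 40 (meant to absorb JPEG compression noise) are assumptions, not part of Momocs or of the original workflow.

```python
import numpy as np


def colored_fraction(img, tol=40):
    """Return the fraction of pixels that are clearly not gray.

    `img` is a height x width x 3 array (channel order does not matter).
    A pixel counts as coloured when its largest and smallest channel
    values differ by more than `tol` (hypothetical default).
    """
    img = np.asarray(img, dtype=np.int16)  # avoid uint8 wrap-around
    spread = img.max(axis=2) - img.min(axis=2)
    return float((spread > tol).mean())


# Synthetic stand-in for a fruit image: white background, black blob,
# and a short yellow line (0, 255, 255 is yellow in OpenCV's BGR order).
demo = np.full((10, 10, 3), 255, dtype=np.uint8)
demo[3:7, 3:7] = 0
demo[2, 3:7] = (0, 255, 255)

pure_mask = np.full((10, 10, 3), 255, dtype=np.uint8)
pure_mask[3:7, 3:7] = 0

print(colored_fraction(demo))       # non-zero: overlay lines present
print(colored_fraction(pure_mask))  # zero: black and white only
```

Any image for which this returns a non-zero fraction would be a candidate for cleaning before import.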

vbonhomme commented 7 years ago

Have you tried without the yellow and red lines?
I guess you're interested in importing the coordinates of both the outline (yellow) and the midline curve (red)?

bhive01 commented 7 years ago

@vbonhomme, yes, the lines were the issue. These images are "false color" outputs from ImageJ that show the outline of the fruit and the midline estimate (not always great; see the last fruit). I was hoping not to have to run my macro again (it takes hours), so I wrote a simple Python/OpenCV script that finds the largest object, fills it in, and writes that out. After that, Momocs::import_jpg() was able to complete.

Thanks for your help.

import argparse
import datetime
import os.path
import sys

import cv2
import numpy as np

# set up parser to take input of folder with images to loop through
ap = argparse.ArgumentParser()
ap.add_argument("--path", required=True, help="Path to folder containing images")

args = ap.parse_args()

# create new folder for storing images (timestamp has one-second resolution, so two runs started within the same second would collide)
currentTime = datetime.datetime.now().strftime('%Y-%m-%d_%H-%M-%S')
foldername = '{0}_BWimages'.format(currentTime)

savedir = os.path.join(args.path, foldername)
os.makedirs(savedir)

# define types of files we're interested in
filetypes = (".JPG", ".jpg", ".JPEG", ".jpeg")

# read the filelist from the path directory
filelist = [f for f in os.listdir(args.path) if f.endswith(filetypes)]
# how long is the list of files for progress reporting
listlength = len(filelist)

for fileindex, filename in enumerate(filelist):
    # progress bar
    pctdone = int(float(fileindex)/float(listlength) * 100)
    sys.stdout.write("\rPercentDone: %d%%" % pctdone)
    sys.stdout.flush()

    # read in image; cv2.imread returns None if the file cannot be decoded
    img = cv2.imread(os.path.join(args.path, filename))
    if img is None:
        continue

    # get image attributes
    height, width, channels = img.shape

    # create gray image for further processing
    gray_img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    #slight blur and then otsu thresholding
    gaussian = cv2.GaussianBlur(gray_img,(5,5),0)
    ret, threshold = cv2.threshold(gaussian, 0, 255, cv2.THRESH_BINARY+cv2.THRESH_OTSU) 

    # invert image to find contours
    des = cv2.bitwise_not(threshold)

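    # find contours; OpenCV 3.x returns (image, contours, hierarchy) while
    # 2.x and 4.x return (contours, hierarchy), so take the last two values
    contours, hierarchy = cv2.findContours(des, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)[-2:]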

    # https://stackoverflow.com/questions/25552765/python-opencv-second-largest-object
    # skip images in which no contour was found
    if not contours:
        continue

    # pick the contour with the largest area
    largestcontour = max(contours, key=cv2.contourArea)

    out_im = np.zeros((height, width, 1), np.uint8)

    out_im = cv2.drawContours(out_im, [largestcontour], -1, 255, cv2.FILLED)  # fill the largest contour with white

    out_im = (255-out_im)

    #save out thresholded and filled in masks
    savefilename = '{0}'.format(filename)
    savefilecomplete = os.path.join(savedir, savefilename)
    cv2.imwrite(savefilecomplete, out_im, [int(cv2.IMWRITE_JPEG_QUALITY), 90])
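
For readers without OpenCV, the same idea (keep only the largest object, with its interior filled) can be sketched with NumPy and a plain flood fill. This is a swapped-in technique, not what the script above does: it uses 4-connected regions rather than contours, and the helper names `flood` and `largest_filled_object` are hypothetical.

```python
from collections import deque

import numpy as np


def flood(passable, seeds):
    """Return the cells of `passable` reachable from `seeds` in 4-connected steps."""
    h, w = passable.shape
    reached = np.zeros((h, w), dtype=bool)
    queue = deque()
    for r, c in seeds:
        if passable[r, c] and not reached[r, c]:
            reached[r, c] = True
            queue.append((r, c))
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < h and 0 <= nc < w and passable[nr, nc] and not reached[nr, nc]:
                reached[nr, nc] = True
                queue.append((nr, nc))
    return reached


def largest_filled_object(dark):
    """`dark` is a 2-D boolean array, True where a pixel belongs to an object."""
    h, w = dark.shape
    # Background reachable from the image border is "outside"; anything else
    # is an object or a hole enclosed by one, so holes get filled for free.
    border = [(r, c) for r in range(h) for c in range(w)
              if r in (0, h - 1) or c in (0, w - 1)]
    filled = ~flood(~dark, border)

    # Keep the connected region with the most pixels.
    best = np.zeros((h, w), dtype=bool)
    seen = np.zeros((h, w), dtype=bool)
    for r, c in zip(*np.nonzero(filled)):
        if not seen[r, c]:
            region = flood(filled, [(r, c)])
            seen |= region
            if region.sum() > best.sum():
                best = region
    return best


# Synthetic example: a ring with a hole in the middle plus a one-pixel speck.
dark = np.zeros((8, 8), dtype=bool)
dark[1:6, 1:6] = True
dark[2:5, 2:5] = False   # hole inside the ring
dark[7, 7] = True        # small stray object
mask = largest_filled_object(dark)
```

The pure-Python loops make this slow on large photographs, so it is meant to show the logic rather than to replace the OpenCV version.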

MomX / Momocs

JPG import gets stuck #166