yinguobing / cnn-facial-landmark

Training code for facial landmark detection based on deep convolutional neural network.
MIT License

Is the drawing of predicted landmarks done after face detection? #69

Open abdou31 opened 5 years ago

abdou31 commented 5 years ago

Hello Yin. I tried to draw some points on an image from the dataset (points I got from the prediction in the landmark.py script) on Android, but the points were drawn at the wrong locations. This is the log I got from landmark.py:

[0.33135968 0.19592011 0.34212315 0.17297666 0.36624995 0.16413747 0.3894139 0.17440952 0.39828074 0.1978043 0.3891497 0.22268474 0.36345637 0.22974193 0.3401759 0.2193309 0.30167252 0.20411113 0.3167112 0.19134495 0.33793524 0.18388326 0.3642417 0.18049955 0.3903508 0.18533507 0.40906873 0.1957745 0.42142123 0.21091096 0.40550107 0.21829814 0.38345626 0.22071144 0.35900232 0.22142673 0.3363348 0.21877256 0.3161971 0.2133534 0.62843406 0.21482795 0.6389724 0.1914106 0.6628249 0.1835615 0.6858679 0.19583184 0.6946868 0.22111627 0.6840309 0.24444285 0.66027373 0.25241333 0.6351568 0.24192403 0.60499936 0.22642238 0.6210091 0.21289764 0.6423563 0.2042976 0.6685919 0.20277795 0.69201195 0.20948553 0.70882106 0.22015369 0.71931773 0.23518339 0.7076659 0.24166131 0.69054717 0.24350837 0.6694564 0.24258481 0.64537776 0.23927754 0.62199306 0.23511863] b'C:\Users\starinfo\cnn-facial-landmark\targetiris\irisdata-300VW_Dataset_2015_12_14-017-000880' reshape values [[0.33135968 0.19592011] [0.34212315 0.17297666] [0.36624995 0.16413747] [0.3894139 0.17440952] [0.39828074 0.1978043 ] [0.3891497 0.22268474] [0.36345637 0.22974193] [0.3401759 0.2193309 ] [0.30167252 0.20411113] [0.3167112 0.19134495] [0.33793524 0.18388326] [0.3642417 0.18049955] [0.3903508 0.18533507] [0.40906873 0.1957745 ] [0.42142123 0.21091096] [0.40550107 0.21829814] [0.38345626 0.22071144] [0.35900232 0.22142673] [0.3363348 0.21877256] [0.3161971 0.2133534 ] [0.62843406 0.21482795] [0.6389724 0.1914106 ] [0.6628249 0.1835615 ] [0.6858679 0.19583184] [0.6946868 0.22111627] [0.6840309 0.24444285] [0.66027373 0.25241333] [0.6351568 0.24192403] [0.60499936 0.22642238] [0.6210091 0.21289764] [0.6423563 0.2042976 ] [0.6685919 0.20277795] [0.69201195 0.20948553] [0.70882106 0.22015369] [0.71931773 0.23518339] [0.7076659 0.24166131] [0.69054717 0.24350837] [0.6694564 0.24258481] [0.64537776 0.23927754] [0.62199306 0.23511863]] marks [[37.112286 21.943052] [38.317795 19.373386] [41.019993 18.383396] [43.614357 
19.533867] [44.607445 22.154081] [43.584766 24.940691] [40.707115 25.731096] [38.0997 24.565062] [33.787323 22.860447] [35.471653 21.430634] [37.848747 20.594925] [40.79507 20.21595 ] [43.719288 20.757528] [45.815697 21.926743] [47.199177 23.622028] [45.41612 24.44939 ] [42.9471 24.71968 ] [40.20826 24.799793] [37.6695 24.502527] [35.414074 23.89558 ] [70.38461 24.06073 ] [71.56491 21.437988] [74.23639 20.558887] [76.81721 21.933167] [77.80492 24.765022] [76.61146 27.3776 ] [73.95066 28.270294] [71.137566 27.095491] [67.759926 25.359306] [69.553024 23.844536] [71.9439 22.881332] [74.88229 22.71113 ] [77.50534 23.46238 ] [79.387955 24.657213] [80.56358 26.34054 ] [79.25858 27.066067] [77.341286 27.272938] [74.97912 27.169498] [72.28231 26.799084] [69.66322 26.333286]]

And this is the code that prints the logs above:


import cv2
import numpy as np

IMG_WIDTH = 112  # the model predicts on a 112 x 112 face crop

predictions = estimator.predict(input_fn=_predict_input_fn)
for result in predictions:
    # 'name' holds the sample path; the matching image sits next to it as .jpg
    img = cv2.imread(result['name'].decode('ASCII') + '.jpg')
    print(result['logits'])
    print(result['name'])
    # logits are normalized (x, y) pairs; scale them to pixel coordinates
    marks = np.reshape(result['logits'], (-1, 2)) * IMG_WIDTH
    print("reshape values  " + str(np.reshape(result['logits'], (-1, 2))))
    print("marks  " + str(marks))

    for mark in marks:
        cv2.circle(img, (int(mark[0]), int(mark[1])),
                   1, (0, 255, 0), -1, cv2.LINE_AA)
    try:
        img = cv2.resize(img, (512, 512))
        cv2.imshow('result', img)
    except Exception as e:
        print(str(e))
    cv2.waitKey()
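For reference, the scaling step can be checked by hand against the log: dividing the first "marks" value 37.112286 by the first normalized value 0.33135968 gives 112, which implies IMG_WIDTH is 112. A quick sketch:

```python
import numpy as np

IMG_WIDTH = 112  # implied by the logged values: 37.112286 / 0.33135968 = 112.0

# the first (x, y) pair from the "logits" log above
logits = np.array([0.33135968, 0.19592011])
marks = logits.reshape(-1, 2) * IMG_WIDTH
# marks is now approximately [[37.112284, 21.943052]], matching the "marks" log
```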

I want to know what exactly the problem is; I use the Canvas library on Android.

• Note that I know you are not familiar with the Android environment, but I want to know whether you used face detection before landmark prediction. As you can see, the difference is very big; the same points are drawn:

From the Python script:

[image: frozen] https://user-images.githubusercontent.com/19480228/63063760-2f892b00-befe-11e9-8937-b7ff0ef2c44c.PNG

From the Android script:

[image: 2019-08-07] https://user-images.githubusercontent.com/19480228/63063763-36b03900-befe-11e9-8487-c18699fa4ec9.png

yinguobing commented 5 years ago

I guess OpenCV and Android use different coordinate systems. The origin in OpenCV is at the top-left corner. A conversion may be needed before drawing on Android.
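Another common source of offset: if the landmarks were predicted on a 112×112 face crop, they have to be mapped back through the face box before being drawn on the full frame (or the Android canvas). A minimal sketch, assuming a face box (x, y, w, h) from some detector; the function name is hypothetical:

```python
def crop_to_frame(marks_crop, box, crop_size=112):
    """Map (x, y) landmarks from a crop_size x crop_size face crop
    back into full-frame coordinates, given the face box (x, y, w, h)."""
    x, y, w, h = box
    # scale each crop coordinate by the box size, then add the box offset
    return [(x + mx * w / crop_size, y + my * h / crop_size)
            for mx, my in marks_crop]
```

For example, the crop point (56, 56) inside a face box (100, 50, 224, 224) maps to (212, 162) in the full frame.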


abdou31 commented 5 years ago

Thanks for your reply. Do you have an idea of how the conversion works? Note that I draw the points after some conversion with the numpy library (np.array); for example, I draw this point: [37.112286, 21.943052]. I am asking whether you used face detection before predicting and drawing landmarks on the face. Thanks.

yinguobing commented 5 years ago

I should have read your question more closely. Yes, face detection is required.

abdou31 commented 5 years ago

I solved this problem, but I did not add face detection on Android. I draw the points that come from the prediction of the ckpt files using the Imgproc.circle() function, and it works (the points are drawn on the eye region).

I guess the image (from the ibug dataset), which has 112×112 dimensions, had already passed through face detection (the annotation task using pts_tools.py). So for other uses, like webcam input, I need to detect the face with OpenCV on Android and crop the frame to 112×112. But before that I need to get a working tflite model, because the tflite model gives different (wrong) values compared to the .ckpt files and the frozen-graph inference.
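One way to pin down the tflite discrepancy is to feed exactly the same preprocessed 112×112 input to the frozen graph and to the tflite interpreter, then compare the raw outputs element-wise. A small hypothetical helper for that comparison:

```python
import numpy as np

def outputs_match(a, b, tol=1e-3):
    """True if two landmark output arrays agree element-wise within tol.

    Intended for comparing frozen-graph output against tflite output
    for the same input image."""
    a = np.asarray(a, dtype=np.float64).ravel()
    b = np.asarray(b, dtype=np.float64).ravel()
    return a.shape == b.shape and bool(np.allclose(a, b, atol=tol))
```

If the outputs already diverge at this point, the problem is in the model conversion or the input preprocessing, not in the Android drawing code.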