jmbuena closed this issue 4 years ago
To do this, we will need to use the lambdas that come with the detector saved on disk.
I have been working on this part, and in the first tests I ran it seems to work correctly. I had started the modifications on the previous version and finished them there; as soon as I adapt them to the new structure I will commit.
After the first tests, the detections are not exactly identical, but looking at the images the windows do fall in the same regions.
Using the image coche_solo1.png, the results are:
TestDetectorPyramidComputeAllStrategy [ x=-13, y=3, w=138, h=51 ] [ x=-21, y=7, w=146, h=51 ] [ x=-17, y=7, w=146, h=51 ] [ x=7, y=3, w=91, h=55 ] [ x=9, y=8, w=88, h=55 ] [ x=14, y=8, w=88, h=55 ] [ x=14, y=12, w=88, h=55 ] [ x=-24, y=-1, w=159, h=59 ] [ x=5, y=3, w=94, h=59 ] [ x=9, y=3, w=94, h=59 ] [ x=9, y=8, w=94, h=59 ]
detections.size() = 11
TestDetectorPyramidApproximatedStrategy [ x=-22, y=8, w=158, h=55 ] [ x=5, y=3, w=94, h=59 ] [ x=9, y=3, w=94, h=59 ] [ x=9, y=8, w=94, h=59 ]
detections.size() = 4
Using the image coches.png, the results are:
TestDetectorPyramidComputeAllStrategy [ x=227, y=265, w=196, h=68 ] [ x=243, y=282, w=109, h=68 ] [ x=254, y=264, w=115, h=72 ] [ x=220, y=264, w=207, h=72 ] [ x=190, y=271, w=206, h=72 ] [ x=242, y=271, w=115, h=72 ] [ x=208, y=271, w=195, h=72 ] [ x=184, y=277, w=206, h=72 ] [ x=196, y=277, w=195, h=72 ] [ x=242, y=277, w=115, h=72 ] [ x=233, y=264, w=124, h=78 ] [ x=196, y=264, w=211, h=78 ] [ x=209, y=264, w=211, h=78 ] [ x=216, y=264, w=224, h=78 ] [ x=255, y=264, w=172, h=78 ] [ x=183, y=271, w=211, h=78 ] [ x=233, y=271, w=124, h=78 ] [ x=196, y=271, w=211, h=78 ] [ x=233, y=277, w=124, h=78 ] [ x=222, y=263, w=132, h=83 ] [ x=176, y=263, w=238, h=83 ] [ x=190, y=263, w=224, h=83 ] [ x=204, y=263, w=224, h=83 ] [ x=217, y=263, w=239, h=83 ] [ x=252, y=263, w=183, h=83 ] [ x=222, y=269, w=132, h=83 ] [ x=183, y=269, w=224, h=83 ] [ x=190, y=269, w=224, h=83 ] [ x=215, y=266, w=142, h=89 ] [ x=222, y=266, w=142, h=89 ] [ x=180, y=266, w=241, h=89 ] [ x=252, y=266, w=142, h=89 ] [ x=222, y=273, w=142, h=89 ] [ x=170, y=262, w=260, h=96 ] [ x=146, y=243, w=765, h=271 ] [ x=163, y=243, w=777, h=271 ] [ x=289, y=187, w=463, h=290 ] [ x=313, y=187, w=463, h=290 ] [ x=289, y=211, w=463, h=290 ] [ x=313, y=211, w=463, h=290 ] [ x=110, y=236, w=819, h=290 ] [ x=313, y=236, w=463, h=290 ] [ x=349, y=175, w=361, h=310 ] [ x=256, y=201, w=495, h=310 ] [ x=282, y=201, w=495, h=310 ] [ x=66, y=227, w=875, h=310 ] [ x=282, y=227, w=495, h=310 ] [ x=323, y=161, w=391, h=336 ] [ x=251, y=189, w=536, h=336 ]
detections.size() = 49
TestDetectorPyramidApproximatedStrategy [ x=260, y=283, w=115, h=72 ] [ x=238, y=283, w=207, h=72 ] [ x=208, y=289, w=206, h=72 ] [ x=260, y=289, w=115, h=72 ] [ x=266, y=289, w=115, h=72 ] [ x=208, y=295, w=206, h=72 ] [ x=260, y=295, w=115, h=72 ] [ x=266, y=295, w=115, h=72 ] [ x=274, y=277, w=172, h=78 ] [ x=253, y=284, w=124, h=78 ] [ x=259, y=284, w=124, h=78 ] [ x=228, y=284, w=211, h=78 ] [ x=235, y=284, w=224, h=78 ] [ x=203, y=290, w=224, h=78 ] [ x=259, y=290, w=124, h=78 ] [ x=222, y=290, w=211, h=78 ] [ x=253, y=297, w=124, h=78 ] [ x=206, y=276, w=219, h=83 ] [ x=225, y=276, w=224, h=83 ] [ x=238, y=276, w=239, h=83 ] [ x=273, y=276, w=183, h=83 ] [ x=190, y=283, w=238, h=83 ] [ x=250, y=283, w=132, h=83 ] [ x=211, y=283, w=224, h=83 ] [ x=233, y=283, w=221, h=83 ] [ x=197, y=290, w=224, h=83 ] [ x=250, y=290, w=132, h=83 ] [ x=257, y=290, w=132, h=83 ] [ x=230, y=281, w=142, h=89 ] [ x=195, y=281, w=241, h=89 ] [ x=202, y=281, w=241, h=89 ] [ x=260, y=281, w=200, h=89 ] [ x=245, y=288, w=142, h=89 ] [ x=252, y=288, w=142, h=89 ] [ x=170, y=262, w=260, h=96 ] [ x=248, y=298, w=164, h=103 ] [ x=200, y=298, w=278, h=103 ] [ x=209, y=298, w=278, h=103 ] [ x=248, y=307, w=164, h=103 ] [ x=200, y=307, w=278, h=103 ] [ x=209, y=307, w=278, h=103 ] [ x=248, y=291, w=176, h=110 ] [ x=248, y=300, w=176, h=110 ] [ x=90, y=282, w=249, h=156 ] [ x=386, y=236, w=463, h=290 ] [ x=410, y=236, w=463, h=290 ] [ x=498, y=236, w=337, h=290 ] [ x=201, y=260, w=831, h=290 ] [ x=498, y=260, w=337, h=290 ] [ x=201, y=284, w=831, h=290 ] [ x=225, y=284, w=831, h=290 ] [ x=498, y=284, w=337, h=290 ] [ x=225, y=308, w=783, h=290 ] [ x=214, y=201, w=837, h=310 ] [ x=359, y=227, w=495, h=310 ] [ x=385, y=227, w=495, h=310 ] [ x=359, y=253, w=495, h=310 ] [ x=333, y=279, w=495, h=310 ] [ x=359, y=279, w=495, h=310 ] [ x=478, y=279, w=361, h=310 ] [ x=359, y=305, w=495, h=310 ] [ x=214, y=305, w=837, h=310 ] [ x=363, y=217, w=536, h=336 ] [ x=205, y=217, w=907, h=336 ] [ x=335, 
y=245, w=536, h=336 ] [ x=363, y=245, w=536, h=336 ] [ x=335, y=273, w=536, h=336 ] [ x=363, y=273, w=536, h=336 ] [ x=204, y=273, w=908, h=336 ] [ x=335, y=301, w=536, h=336 ] [ x=363, y=301, w=536, h=336 ] [ x=382, y=301, w=553, h=336 ] [ x=401, y=199, w=413, h=355 ] [ x=353, y=199, w=567, h=355 ] [ x=324, y=229, w=567, h=355 ] [ x=353, y=229, w=567, h=355 ] [ x=324, y=259, w=567, h=355 ] [ x=353, y=259, w=567, h=355 ] [ x=338, y=232, w=654, h=410 ] [ x=338, y=266, w=654, h=410 ] detections.size() = 80
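The remark that the detections are not identical but land in the same regions can be checked numerically instead of by eye. The following is a minimal sketch, not code from the repository: it assumes each detection is an axis-aligned `[x, y, w, h]` box and measures, via intersection over union (IoU), what fraction of the approximated-strategy boxes have a close counterpart in the compute-all output. The names `Box`, `iou`, `bestIou` and `matchedFraction` are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Axis-aligned detection window, same fields as the printed test output.
struct Box { int x, y, w, h; };

// Intersection over union of two boxes; 0 when they do not overlap.
double iou(const Box& a, const Box& b) {
    assert(a.w > 0 && a.h > 0 && b.w > 0 && b.h > 0);
    const int x1 = std::max(a.x, b.x);
    const int y1 = std::max(a.y, b.y);
    const int x2 = std::min(a.x + a.w, b.x + b.w);
    const int y2 = std::min(a.y + a.h, b.y + b.h);
    const double inter =
        static_cast<double>(std::max(0, x2 - x1)) * std::max(0, y2 - y1);
    const double uni =
        static_cast<double>(a.w) * a.h + static_cast<double>(b.w) * b.h - inter;
    return inter / uni;
}

// Best IoU of `query` against any box in `refs` (0 if `refs` is empty).
double bestIou(const Box& query, const std::vector<Box>& refs) {
    double best = 0.0;
    for (const Box& r : refs) best = std::max(best, iou(query, r));
    return best;
}

// Fraction of `approx` boxes whose best match in `full` reaches `threshold`.
double matchedFraction(const std::vector<Box>& approx,
                       const std::vector<Box>& full, double threshold) {
    if (approx.empty()) return 0.0;
    int matched = 0;
    for (const Box& a : approx)
        if (bestIou(a, full) >= threshold) ++matched;
    return static_cast<double>(matched) / static_cast<double>(approx.size());
}
```

Fed with the coche_solo1.png boxes listed above and a threshold of 0.5, all four approximated detections find a match: three coincide exactly with compute-all boxes, and `[ x=-22, y=8, w=158, h=55 ]` overlaps `[ x=-21, y=7, w=146, h=51 ]` with an IoU of roughly 0.83. The threshold is an arbitrary choice for illustration.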
Full strategy:
Approximated scales:
Finished fixing this in commit 51ccca4b817b81b87cc285bd80765cb3989b61bc.
It turned out that the ACF channels (both real and approximated) had to be computed before postprocessing them (with convTri and crop), and only then passed to the LDCF filters. That was the main problem.
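The ordering described here can be illustrated with a toy, one-dimensional sketch. It is not the repository's code: a channel is reduced to a single row of floats, the approximation step is a nearest-neighbour resample times a scalar, and `smoothTri1` (a radius-1 triangle kernel `[1 2 1]/4`) and `crop` are simplified stand-ins for convTri and the crop step. All names are hypothetical. The point is only the order: the approximated channel is derived from the raw real channel, and smoothing and cropping are applied to both afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Row = std::vector<float>;

// Stand-in for channel approximation: nearest-neighbour resample to
// `newSize` samples, then multiply every sample by `ratio`.
Row resample(const Row& in, std::size_t newSize, float ratio) {
    assert(!in.empty() && newSize > 0);
    Row out(newSize);
    for (std::size_t i = 0; i < newSize; ++i)
        out[i] = in[i * in.size() / newSize] * ratio;
    return out;
}

// Stand-in for convTri: radius-1 triangle kernel [1 2 1]/4,
// with the border samples replicated.
Row smoothTri1(const Row& in) {
    const std::size_t n = in.size();
    Row out(n);
    for (std::size_t i = 0; i < n; ++i) {
        const float left = in[i > 0 ? i - 1 : 0];
        const float right = in[i + 1 < n ? i + 1 : n - 1];
        out[i] = (left + 2.0f * in[i] + right) / 4.0f;
    }
    return out;
}

// Stand-in for crop: drop `pad` samples from each end.
Row crop(const Row& in, std::size_t pad) {
    assert(in.size() >= 2 * pad);
    return Row(in.begin() + static_cast<std::ptrdiff_t>(pad),
               in.end() - static_cast<std::ptrdiff_t>(pad));
}

Row postprocess(const Row& raw, std::size_t pad) {
    return crop(smoothTri1(raw), pad);
}

struct Levels { Row real; Row approx; };

// 1) approximate from the RAW real channel, 2) postprocess both.
// The results would then go to the LDCF filter bank (not shown).
Levels buildLevels(const Row& rawReal, std::size_t approxSize,
                   float ratio, std::size_t pad) {
    const Row rawApprox = resample(rawReal, approxSize, ratio);
    return {postprocess(rawReal, pad), postprocess(rawApprox, pad)};
}
```

Swapping the two steps in `buildLevels` (postprocessing the real channel first and resampling the result) would generally give a different approximated channel, which is the kind of mismatch the fix addresses.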
The goal is to complete the implementation of the channel computation so that many of the channels are approximated from those computed at full octaves. In other words, complete the implementation so that it works like P. Dollár's. The speed will also need to be checked (implementing the corresponding test among the ChannelsPyramid tests will be enough).
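As a rough illustration of the idea, the sketch below plans which pyramid scales are computed for real and which are approximated, and computes the gain applied to an approximated channel. It is a sketch under stated assumptions, not the repository's implementation: scale `i` is taken to be `2^(-i / nPerOct)`, only full octaves are real, ties go to the finer scale, and the gain follows the power law `(s / sReal)^(-lambda)` commonly described for fast feature pyramids, with one `lambda` per channel type (the values the thread says come with the saved detector). `ScalePlan`, `planScales` and `approxFactor` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct ScalePlan {
    double scale;           // resize factor relative to the input image
    std::size_t realIndex;  // index of the real scale this one derives from
    bool isReal;            // true if the channels are actually computed
};

// Scales are 2^(-i / nPerOct); only full octaves (i multiple of nPerOct)
// are real. Every other scale points at its nearest real scale, with
// ties resolved toward the finer (lower-index) one.
std::vector<ScalePlan> planScales(std::size_t nScales, std::size_t nPerOct) {
    assert(nScales > 0 && nPerOct > 0);
    const std::size_t lastReal = ((nScales - 1) / nPerOct) * nPerOct;
    std::vector<ScalePlan> plan(nScales);
    for (std::size_t i = 0; i < nScales; ++i) {
        std::size_t nearest = ((i + (nPerOct - 1) / 2) / nPerOct) * nPerOct;
        if (nearest > lastReal) nearest = lastReal;
        const double scale =
            std::pow(2.0, -static_cast<double>(i) / static_cast<double>(nPerOct));
        plan[i] = ScalePlan{scale, nearest, nearest == i};
    }
    return plan;
}

// Gain applied to a channel resampled from `realScale` to `scale`.
double approxFactor(double scale, double realScale, double lambda) {
    assert(scale > 0.0 && realScale > 0.0);
    return std::pow(scale / realScale, -lambda);
}
```

For example, with `planScales(17, 8)` the real scales are indices 0, 8 and 16; index 3 is approximated from index 0 and index 5 from index 8. Each approximated channel would be obtained by resampling the corresponding real channel and multiplying it by `approxFactor` with that channel's `lambda`.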