elucideye / acf

Aggregated Channel Feature object detection in C++ and OpenGL ES 2.0 based on https://github.com/pdollar/toolbox
BSD 3-Clause "New" or "Revised" License

Clarification for return type of chnsPyramid needed #35

Closed JN-Jones closed 6 years ago

JN-Jones commented 6 years ago

While using chnsPyramid, I noticed that its return type uses std::vector<std::vector<MatP>> for the data element. My first guess was that the outer vector holds the different scales and the inner vector holds the different channels. However, after some tests, I found that the channels are all stored in a single MatP object and the inner vector apparently always contains only one object. Am I missing something, or could the type be changed to std::vector<MatP> for simplicity?

headupinclouds commented 6 years ago

> However after some tests I've noticed that the channels are saved in one MatP

It has been a while since I've looked at that code, but I recall at least a couple of cases where the MATLAB toolbox models were storing a 2D array of cells that was empty except for the first row.

For reference, this is the relevant code:

https://github.com/elucideye/acf/blob/1c9fa310e2d743f7b91d44c1b1016e40376eb183/src/lib/acf/ACF.h#L340

And we're discussing array_type here (previously a Boost multi-dimensional array, then simplified to vector<vector<>> to remove the dependency):

    struct Pyramid
    {
        using array_type = std::vector<std::vector<MatP>>;

        Detector::Options::Pyramid pPyramid; // < exact input parameters
        int nTypes = 0;
        int nScales = 0;
        array_type data;
        std::vector<Channels::Info> info;
        std::vector<double> lambdas;
        std::vector<double> scales;
        std::vector<cv::Size2d> scaleshw;

        // .rois   - [ LEVELS x CHANNELS ] array for channel access
        std::vector<std::vector<cv::Rect>> rois;
    };

I believe it can be replaced with a 1D vector. I can take a look this weekend. We can probably bump the class version in cereal to support the new layout and handle existing/old cereal portable binary archive models as a special case.

JN-Jones commented 6 years ago

Yeah, that's the one I'm talking about. In the cases I've seen, both with your code and with the original toolbox, there was nothing that couldn't be stored in a 1D vector, though I may have missed some cases. Judging from the basic idea of how the pyramid is calculated, I don't think there are any. It's probably better to run some tests first, though.
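One way to run such a test is to check, for each pyramid produced, whether every scale holds exactly one entry. The following is a minimal sketch under stated assumptions: `FakeMatP`, `isFlattenable` and `flatten` are hypothetical names introduced for illustration and are not part of acf, and `FakeMatP` merely stands in for the real `MatP` type.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for acf's MatP; only records a channel count.
struct FakeMatP
{
    int channels;
};

// Same shape as Pyramid::array_type, but over the stand-in type.
using Array2D = std::vector<std::vector<FakeMatP>>;

// True if every scale holds exactly one entry, i.e. the data
// would fit into a plain std::vector<FakeMatP> without loss.
bool isFlattenable(const Array2D& data)
{
    for (const auto& scale : data)
    {
        if (scale.size() != 1)
        {
            return false;
        }
    }
    return true;
}

// Collapse a [nScales][1] array into [nScales].
// Precondition: isFlattenable(data).
std::vector<FakeMatP> flatten(const Array2D& data)
{
    assert(isFlattenable(data));
    std::vector<FakeMatP> flat;
    flat.reserve(data.size());
    for (const auto& scale : data)
    {
        flat.push_back(scale.front());
    }
    return flat;
}
```

Running a check like `isFlattenable` over pyramids built with every option combination in use would show whether any caller still depends on the second dimension.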

headupinclouds commented 6 years ago

https://github.com/pdollar/toolbox/blob/1a3c9869033548abb0c7a3c2aa6a7902c36f39c2/channels/chnsPyramid.m#L86

    %   .pChns        - parameters for creating channels (see chnsCompute.m)

https://github.com/pdollar/toolbox/blob/1a3c9869033548abb0c7a3c2aa6a7902c36f39c2/channels/chnsPyramid.m#L92

    %   .nTypes       - number of channel types

https://github.com/pdollar/toolbox/blob/1a3c9869033548abb0c7a3c2aa6a7902c36f39c2/channels/chnsPyramid.m#L170-L175

    if(concat && nTypes), for i=1:nScales, data{i}=cat(3,data0{i,:}); end; end

The final channel concatenation is optional:

https://github.com/elucideye/acf/blob/b02211e54f66a08d81751dce2327b47de9c05e75/src/lib/acf/chnsPyramid.cpp#L392-L401

    if (concat && nTypes)
    {
        auto data0 = data;
        data.resize(nScales);
        for (int i = 0; i < nScales; i++)
        {
            data[i].resize(1);
            fuseChannels(data0[i].begin(), data0[i].end(), data[i][0]);
        }
    }

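So a 2D array may be needed to support some cases: when concat is false, the branch above is skipped, and data presumably keeps one entry per channel type at each scale.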
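To make the two layouts concrete, here is a minimal sketch of the quoted branch using plain standard-library types instead of OpenCV. `Planes`, `fusePlanes` and `finalizePyramid` are hypothetical stand-ins (for `MatP`, `fuseChannels` and the tail of `chnsPyramid`), so this only illustrates the shape of the data, not the actual implementation.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for MatP: a list of plane ids.
using Planes = std::vector<int>;
// Indexed as [scale][channel type], like Pyramid::array_type.
using PlaneArray2D = std::vector<std::vector<Planes>>;

// Stand-in for fuseChannels: append the planes of every
// channel type into a single object.
Planes fusePlanes(const std::vector<Planes>& types)
{
    Planes fused;
    for (const auto& t : types)
    {
        fused.insert(fused.end(), t.begin(), t.end());
    }
    return fused;
}

// Mirrors the quoted branch: with concat set, each scale ends up
// with exactly one fused entry; otherwise the input layout
// [nScales][nTypes] is returned unchanged.
PlaneArray2D finalizePyramid(const PlaneArray2D& data0, bool concat, int nTypes)
{
    if (!(concat && nTypes))
    {
        return data0;
    }
    PlaneArray2D data(data0.size());
    for (std::size_t i = 0; i < data0.size(); i++)
    {
        data[i].resize(1);
        data[i][0] = fusePlanes(data0[i]);
    }
    return data;
}
```

With concat enabled, the inner dimension is always 1, which matches the behaviour reported at the top of this issue; with it disabled, the inner dimension stays at nTypes, which is the case a 1D vector could not represent.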