Ldpe2G / ArmNeonOptimization

Arm neon optimization practice
MIT License

Result seems off on my side #3

Open ziyuang opened 5 years ago

ziyuang commented 5 years ago

My run produces (left: input; middle: my implementation; right: this implementation; radius=10):

[image: noisy_square]

Can you try this input? It's a 201x201 FP32 square, stored in little-endian byte order: noisy-square.tar.gz

I use this routine to read the file:

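#include <algorithm>
#include <cstring>
#include <fstream>
#include <string>
#include <vector>

template<typename T>
std::vector<T> read_raw(const std::string &path, bool little_endian = false)
{
    std::ifstream file(path, std::ifstream::binary);
    // 1 MiB heap buffer; a runtime-sized stack array is not standard C++.
    std::vector<char> buffer(1 << 20);
    std::vector<T> raw;
    // Loop on the read result instead of eof(), so a file that fails to open
    // cannot spin forever; the final partial read still reports gcount() > 0.
    while (file.read(buffer.data(), buffer.size()) || file.gcount() > 0)
    {
        const std::streamsize count = file.gcount();
        // Stop before a trailing partial element.
        for (std::streamsize i = 0; i + static_cast<std::streamsize>(sizeof(T)) <= count; i += sizeof(T))
        {
            char *begin = &buffer[i];
            // Note: bytes are reversed whenever the flag is set, regardless of host byte order.
            if (little_endian)
                std::reverse(begin, begin + sizeof(T));
            T value;
            std::memcpy(&value, begin, sizeof(T)); // avoids an unaligned, aliasing cast
            raw.push_back(value);
        }
    }
    return raw;
}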
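One thing that may be worth checking: the routine above reverses bytes whenever its `little_endian` flag is set, so on a little-endian host (most ARM and x86 setups) passing `true` would scramble a little-endian file. Below is a minimal sketch of a decode that swaps only when the host is big-endian, plus the element count I would expect for a 201x201 FP32 input. The names `host_is_little_endian`, `decode_le_f32`, and `kExpectedCount` are hypothetical helpers for illustration, not part of this repository.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical helper: true when the host stores the least significant byte first.
inline bool host_is_little_endian()
{
    const std::uint32_t probe = 1;
    unsigned char first_byte;
    std::memcpy(&first_byte, &probe, 1);
    return first_byte == 1;
}

// Hypothetical helper: decode little-endian FP32 bytes into floats,
// swapping bytes only when the host is big-endian.
inline std::vector<float> decode_le_f32(const std::vector<unsigned char> &bytes)
{
    std::vector<float> out;
    out.reserve(bytes.size() / sizeof(float));
    for (std::size_t i = 0; i + sizeof(float) <= bytes.size(); i += sizeof(float))
    {
        unsigned char tmp[sizeof(float)];
        std::memcpy(tmp, &bytes[i], sizeof(float));
        if (!host_is_little_endian())
            std::reverse(tmp, tmp + sizeof(float));
        float value;
        std::memcpy(&value, tmp, sizeof(float));
        out.push_back(value);
    }
    return out;
}

// Expected element count for a 201x201 input: 40401 floats (161604 bytes).
constexpr std::size_t kExpectedCount = 201 * 201;
```

If the decoded vector does not hold `kExpectedCount` elements, or the values look like denormals or huge magnitudes, the byte order is the first thing I would suspect.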

Thank you.