openzim / libzim

Reference implementation of the ZIM specification
https://download.openzim.org/release/libzim/
GNU General Public License v2.0

Potential minor difference between specification and implementation of counter #838

Closed IMayBeABitShy closed 10 months ago

IMayBeABitShy commented 10 months ago

Hi everyone,

This is just a very minor issue, but I think I have spotted a slight deviation between how the specification describes the M/Counter metadata and how the libzim writer implements it.

The specification describes the M/Counter metadata as follows:

Number of non-redirect entries per mime-type

The libzim writer, however, increments the counter as follows:

void CounterHandler::handle(Dirent* dirent, std::shared_ptr<Item> item)
{
  if (dirent->getNamespace() != NS::C) {
    return;
  }
  auto mimetype = item->getMimeType();
  if (mimetype.empty()) {
    return;
  }
  m_mimetypeCounter[mimetype] += 1;
}

The difference is that the specification makes it sound as if all non-redirect entries should be counted regardless of their namespace, while libzim seems to count only non-redirect entries in the C namespace.
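
To make the two readings concrete, here is a minimal sketch that counts the same entries both ways. It is not libzim code: `Entry`, `countAllNamespaces`, `countContentOnly` and `sampleEntries` are hypothetical names invented for illustration, and the sketch assumes that a redirect can be recognised by a flag or an empty mimetype.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for a ZIM entry (not libzim's Dirent/Item).
struct Entry {
  char ns;               // namespace letter, e.g. 'C' (user content) or 'M' (metadata)
  bool isRedirect;       // assumption: redirects are flagged explicitly
  std::string mimetype;  // assumption: empty for redirects
};

using Counter = std::map<std::string, unsigned>;

// Literal reading of the specification text:
// every non-redirect entry is counted, whatever its namespace.
Counter countAllNamespaces(const std::vector<Entry>& entries) {
  Counter counter;
  for (const auto& e : entries) {
    if (e.isRedirect || e.mimetype.empty()) continue;
    counter[e.mimetype] += 1;
  }
  return counter;
}

// Reading that matches the quoted handler:
// only non-redirect entries in the C namespace are counted.
Counter countContentOnly(const std::vector<Entry>& entries) {
  Counter counter;
  for (const auto& e : entries) {
    if (e.ns != 'C' || e.isRedirect || e.mimetype.empty()) continue;
    counter[e.mimetype] += 1;
  }
  return counter;
}

// Made-up sample data: two content entries, one redirect, one metadata entry.
std::vector<Entry> sampleEntries() {
  return {
    {'C', false, "text/html"},
    {'C', false, "image/png"},
    {'C', true,  ""},
    {'M', false, "text/plain"},
  };
}
```

With this sample, the first function reports a `text/plain` entry that comes only from the M namespace, while the second reports just `text/html` and `image/png`. That is the whole deviation: the two counters differ exactly by the non-redirect entries outside C.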

This is only a very minor deviation, and I think the libzim behavior is more reasonable (the mimetype count is mostly interesting for informing users about actual content), but I just wanted to mention it.

kelson42 commented 10 months ago

@mgautierfr This seems correct. Should we update the spec to specify that we deal with the C namespace?

mgautierfr commented 10 months ago

Indeed. We should update the spec. Only non-redirect entries in the "User Content" namespace are (and should be) counted.

kelson42 commented 10 months ago

Fixed