Commit 264c71cd886f59b96323db260079574420d2c9ef added a valid docs count to the metadata for some input readers.
This is useful for a lot of things, but the feature is incomplete. The count can only be relied upon when it comes straight from the input reader. Modules further down the pipeline may produce additional invalid documents, but the count is not currently updated to reflect them. All internal modules should update (or at least remove) the count, just as they currently do with the length count.
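As a rough illustration of the two options (update or remove), here is a minimal sketch. It assumes the metadata is a plain dict and uses hypothetical key names `length` and `valid_documents` and hypothetical helper names; the real metadata keys and module API may differ.

```python
def update_counts(input_metadata, output_docs, is_valid):
    """Return a copy of the metadata with counts recomputed from the module's output.

    The keys "length" and "valid_documents" are assumed names for illustration.
    """
    docs = list(output_docs)
    metadata = dict(input_metadata)  # copy, so the input metadata is left untouched
    metadata["length"] = len(docs)
    metadata["valid_documents"] = sum(1 for doc in docs if is_valid(doc))
    return metadata


def drop_stale_count(input_metadata):
    """Return a copy of the metadata without the valid docs count.

    Fallback for modules that cannot cheaply recompute the count: having
    no count is safer than passing on one that may be wrong.
    """
    metadata = dict(input_metadata)
    metadata.pop("valid_documents", None)
    return metadata
```

A module that knows which of its outputs are invalid would use something like the first function; any other module would at least do the equivalent of the second, so that downstream consumers never see a count inherited unchanged from the input reader.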