frictionlessdata / frictionless-py

Data management framework for Python that provides functionality to describe, extract, validate, and transform tabular data
https://framework.frictionlessdata.io
MIT License

Package validation clarifications #936

Closed niconoe closed 3 years ago

niconoe commented 3 years ago

Overview

Hello, I am trying to validate a package in Python (using report = validate_package(descriptor_data_dict)), but I get weird behaviour:

It's unclear to me whether this is a bug in validate_package or whether I'm misusing or misunderstanding the content of the Report object. Its reference documentation at https://framework.frictionlessdata.io/docs/references/api-reference#report seems incomplete and not very detailed compared to what I see in my Python debugger.

Maybe it's better to completely ignore report.valid and assume the report is valid if it has no errors?
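Before ignoring report.valid, it may help to check whether errors are stored somewhere other than the top-level list. The sketch below is only an illustration under assumptions: it treats the report as a plain mapping with a top-level "errors" list and a "tasks" list whose entries each carry their own "errors". That layout, the helper name collect_errors, and the sample data are hypothetical and are not taken from the reference documentation.

```python
from typing import Any, Dict, List, Mapping


def collect_errors(report: Mapping[str, Any]) -> List[Dict[str, Any]]:
    """Gather errors from the report itself and from each of its tasks.

    Assumption: `report` is a dict-like object with optional "errors" and
    "tasks" keys, and every task is a dict-like object with an optional
    "errors" key.
    """
    collected: List[Dict[str, Any]] = []
    # Errors attached directly to the report.
    for error in report.get("errors", []):
        collected.append({"scope": "report", **error})
    # Errors attached to individual tasks (for example, one per resource).
    for index, task in enumerate(report.get("tasks", [])):
        for error in task.get("errors", []):
            collected.append({"scope": f"task[{index}]", **error})
    return collected


# Made-up report used only to exercise the helper; the error code and
# message are placeholders, not real validation output.
sample_report = {
    "valid": False,
    "errors": [],
    "tasks": [
        {
            "valid": False,
            "errors": [{"code": "example-error", "message": "placeholder"}],
        },
    ],
}

all_errors = collect_errors(sample_report)
# Treat the report as valid only if it says so and no error was found anywhere.
is_valid = bool(sample_report["valid"]) and not all_errors

for error in all_errors:
    print(error["scope"], error["code"], error["message"])
```

If the real Report object nests errors this way, an empty top-level error list together with valid being False would no longer be contradictory, and relying on "no errors at the top level" alone could hide real problems.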

For debugging purposes, here is the descriptor that triggers this behaviour:

descriptor_data_dict = {
    "name": "mica-muskrat-and-coypu-20210707160815",
    "id": "https://doi.org/10.5281/zenodo.4893244",
    "profile": "http://localhost:52880/camtrap-dp-profile.json",
    "created": "2021-07-07T16:08:15Z",
    "sources": [
        {"title": "Agouti", "path": "https://www.agouti.eu", "email": "agouti@wur.nl"}
    ],
    "contributors": [
        {
            "title": "Abel De Boer",
            "email": "adeboer@wetterskipfryslan.nl",
            "role": "contributor",
        },
        {
            "title": "Axel Neukermans",
            "email": "axel.neukermans@inbo.be",
            "role": "author",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {
            "title": "Björn Matthies",
            "email": "bjoern.matthies@lwk-niedersachsen.de",
            "role": "contributor",
        },
        {
            "title": "Brecht Neukermans",
            "email": "brecht.neukermans@inbo.be",
            "role": "author",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {
            "title": "Claudia Maistrelli",
            "email": "claudia.maistrelli@tiho-hannover.de",
            "role": "contributor",
        },
        {
            "title": "Danny Van der beeck",
            "email": "daniel.vanderbeeck@gmail.com",
            "role": "contributor",
        },
        {
            "title": "Dan Slootmaekers",
            "email": "d.slootmaekers@vmm.be",
            "role": "contributor",
        },
        {
            "title": "Dennis Donckers",
            "email": "dennis.donckers2@telenet.be",
            "role": "contributor",
        },
        {
            "title": "Emma Cartuyvels",
            "email": "emma.cartuyvels@inbo.be",
            "role": "author",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {
            "title": "Frank Huysentruyt",
            "email": "frank.huysentruyt@inbo.be",
            "role": "contributor",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {
            "title": "Friederike Gethöffer",
            "email": "friederike.gethoeffer@tiho-hannover.de",
            "role": "contributor",
        },
        {
            "title": "Heiko Fritz",
            "email": "foersterheiko@gmx.de",
            "role": "contributor",
        },
        {
            "title": "Jan Lodewijkx",
            "email": "j.lodewijkx@vmm.be",
            "role": "contributor",
        },
        {
            "title": "Jim Casaer",
            "email": "jim.casaer@inbo.be",
            "role": "maintainer",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {"title": "Kurt Schamp", "email": "kurt.schamp@inbo.be", "role": "contributor"},
        {
            "title": "Lilja Fromme",
            "email": "lilja.fromme@tiho-hannover.de",
            "role": "contributor",
        },
        {
            "title": "Lydia Liebgott",
            "email": "lydia.liebgott@tiho-hannover.de",
            "role": "contributor",
        },
        {
            "title": "Peter Desmet",
            "email": "peter.desmet@inbo.be",
            "path": "https://orcid.org/0000-0002-8442-8025",
            "role": "maintainer",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {
            "title": "Tim Adriaens",
            "email": "tim.adriaens@inbo.be",
            "role": "contributor",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {
            "title": "Yasmine Verzelen",
            "email": "yasmine.verzelen@inbo.be",
            "role": "contributor",
            "organization": "Research Institute for Nature and Forest (INBO)",
        },
        {
            "title": "Yorick Liefting",
            "email": "yorick.liefting@wur.nl",
            "role": "maintainer",
            "organization": "Wageningen University",
        },
    ],
    "organizations": [
        {
            "title": "Research Institute for Nature and Forest (INBO)",
            "path": "https://inbo.be",
        }
    ],
    "project": {
        "title": "MICA - Muskrat and Coypu",
        "description": "This project is part of the LIFE project MICA, in which innovative techniques are tested for a more efficient control of muskrat and coypu populations, both invasive species. Camera traps were located in areas where the presence of muskrat and/or coypu was suspected.",
        "samplingDesign": "targeted",
        "captureMethod": ["motion detection", "time lapse"],
        "animalTypes": ["unmarked"],
        "classificationLevel": "sequence",
        "sequenceInterval": 120,
        "_id": "86cabc14-d475-4439-98a7-e7b590bed60e",
    },
    "spatial": {
        "type": "Feature",
        "bbox": [3.51755, 50.69905, 7.0243, 53.27052],
        "properties": {},
        "geometry": {
            "type": "Polygon",
            "coordinates": [
                [
                    [3.51755, 50.69905],
                    [7.0243, 50.69905],
                    [7.0243, 53.27052],
                    [3.51755, 53.27052],
                    [3.51755, 50.69905],
                ]
            ],
        },
    },
    "temporal": {"start": "2019-10-09", "end": "2021-03-28"},
    "taxonomic": [
        {
            "taxonID": "DGP6",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Anas platyrhynchos",
            "vernacularNames": {"en": "mallard", "nl": "wilde eend"},
        },
        {
            "taxonID": "DGPL",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Anas strepera",
            "vernacularNames": {"en": "gadwall", "nl": "krakeend"},
        },
        {
            "taxonID": "32FH",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Ardea",
            "vernacularNames": {"en": "great herons", "nl": "reigers"},
        },
        {
            "taxonID": "GCHS",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Ardea cinerea",
            "vernacularNames": {"en": "grey heron", "nl": "blauwe reiger"},
        },
        {
            "taxonID": "RQPW",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Castor fiber",
            "vernacularNames": {"en": "Eurasian beaver", "nl": "bever"},
        },
        {
            "taxonID": "6MB3T",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Homo sapiens",
            "vernacularNames": {"en": "human", "nl": "mens"},
        },
        {
            "taxonID": "3Y9VW",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Martes foina",
            "vernacularNames": {"en": "beech marten", "nl": "steenmarter"},
        },
        {
            "taxonID": "44QYC",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Mustela putorius",
            "vernacularNames": {"en": "European polecat", "nl": "bunzing"},
        },
        {
            "taxonID": "5BSG3",
            "taxonIDReference": "https://www.catalogueoflife.org",
            "scientificName": "Vulpes vulpes",
            "vernacularNames": {"en": "red fox", "nl": "vos"},
        },
    ],
    "platform": {
        "title": "Agouti",
        "path": "https://agouti.eu",
        "packageID": "3bdce07e-f0b1-4566-948a-e1f0a0d0a214",
    },
    "resources": [
        {
            "name": "deployments",
            "path": "deployments.csv",
            "profile": "tabular-data-resource",
            "format": "csv",
            "mediatype": "text/csv",
            "encoding": "utf-8",
            "schema": {
                "name": "deployments",
                "title": "Deployments",
                "description": "Table with camera trap deployments. Includes `deploymentID`, start, end, location and camera setup information.",
                "fields": [
                    {
                        "name": "deploymentID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier (within a project) of the deployment.",
                        "example": "dep1",
                        "constraints": {"required": True, "unique": True},
                    },
                    {
                        "name": "locationID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier (within a project) of the sampling location for this deployment.",
                        "example": "loc1",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "locationName",
                        "type": "string",
                        "format": "default",
                        "description": "Name given to the sampling location.",
                        "example": "Białowieża MRI 01",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "longitude",
                        "type": "number",
                        "format": "default",
                        "description": "Longitude of the sampling location in decimal degrees, using the WGS84 datum.",
                        "example": "23.84995",
                        "constraints": {
                            "required": True,
                            "minimum": -180,
                            "maximum": 180,
                        },
                    },
                    {
                        "name": "latitude",
                        "type": "number",
                        "format": "default",
                        "description": "Latitude of the sampling location in decimal degrees, using the WGS84 datum.",
                        "example": "52.70442",
                        "constraints": {
                            "required": True,
                            "minimum": -90,
                            "maximum": 90,
                        },
                    },
                    {
                        "name": "coordinateUncertainty",
                        "type": "integer",
                        "format": "default",
                        "description": "Horizontal distance in meters from the given `latitude` and `longitude` describing the smallest circle containing the location of the camera trap. Use for example when coordinates are [rounded](https://en.wikipedia.org/wiki/Decimal_degrees#Precision) to conceal the precise location of an active camera. Term borrowed from [Darwin Core](http://rs.tdwg.org/dwc/terms/coordinateUncertaintyInMeters).",
                        "example": "100",
                        "constraints": {"required": False, "minimum": 1},
                    },
                    {
                        "name": "start",
                        "type": "datetime",
                        "format": "%Y-%m-%dT%H:%M:%S%z",
                        "description": "Date and time when the deployment started, as an ISO 8601 formatted string with timezone designator (`YYYY-MM-DDThh:mm:ssZ` or `YYYY-MM-DDThh:mm:ss±hh:mm`).",
                        "example": "2020-03-01T22:00:00Z",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "end",
                        "type": "datetime",
                        "format": "%Y-%m-%dT%H:%M:%S%z",
                        "description": "Date and time when the deployment ended, as an ISO 8601 formatted string with timezone designator (`YYYY-MM-DDThh:mm:ssZ` or `YYYY-MM-DDThh:mm:ss±hh:mm`).",
                        "example": "2020-04-01T22:00:00Z",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "setupBy",
                        "type": "string",
                        "format": "default",
                        "description": "Name or unique identifier of the person who set up the camera for this deployment.",
                        "example": [
                            "Jim Casaer",
                            "2ef60d48-fd67-4bac-9569-49a03b0ef096",
                        ],
                        "constraints": {"required": False},
                    },
                    {
                        "name": "cameraID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier (within a project) of the camera used for this deployment (e.g. the camera device serial number).",
                        "example": "P800HG08192031",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "cameraModel",
                        "type": "string",
                        "format": "default",
                        "description": "Manufacturer and model of the camera provided in this format: `manufacturer-model`.",
                        "example": "Reconyx-PC800",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "cameraInterval",
                        "type": "integer",
                        "description": "Time specified between shutter triggers when activity in the sensor will not trigger the shutter. Expressed in seconds.",
                        "example": "120",
                        "constraints": {"required": False, "minimum": 0},
                    },
                    {
                        "name": "cameraHeight",
                        "type": "number",
                        "description": "Height at which the camera was deployed. Expressed in meters.",
                        "example": "1.2",
                        "constraints": {"required": False, "minimum": 0},
                    },
                    {
                        "name": "cameraTilt",
                        "type": "integer",
                        "format": "default",
                        "description": "Angle at which the camera was deployed in the vertical plane. Expressed in degrees, with -90 facing down, 0 horizontal and 90 facing up.",
                        "example": "-90",
                        "constraints": {
                            "required": False,
                            "minimum": -90,
                            "maximum": 90,
                        },
                    },
                    {
                        "name": "cameraHeading",
                        "type": "integer",
                        "format": "default",
                        "description": "Angle at which the camera was deployed in the horizontal plane. Expressed in decimal degrees clockwise from north, with values ranging from 0 to 360: 0 = north, 90 = east, 180 = south, 270 = west.",
                        "example": "225",
                        "constraints": {
                            "required": False,
                            "minimum": 0,
                            "maximum": 360,
                        },
                    },
                    {
                        "name": "timestampIssues",
                        "type": "boolean",
                        "description": "`true` if timestamps for this deployment in `media.csv` and `observations.csv` are known to have (unsolvable) issues (e.g. unknown timezone, am/pm switch).",
                        "example": "false",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "baitUse",
                        "type": "string",
                        "format": "default",
                        "description": "Type of bait (if any) that was used for this deployment. If other, more info can be provided in the `comments` field.",
                        "example": "food",
                        "constraints": {
                            "required": False,
                            "enum": [
                                "none",
                                "scent",
                                "food",
                                "visual",
                                "acoustic",
                                "other",
                            ],
                        },
                    },
                    {
                        "name": "session",
                        "type": "string",
                        "format": "default",
                        "description": "Temporal deployment group. Common sessions are seasons (wet and dry), months, years or other logical groupings when sampling occurred. For groupings without context, use `tags`.",
                        "example": "winter 2020",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "array",
                        "type": "string",
                        "format": "default",
                        "description": "Spatial deployment group. Common arrays are grids, arrays, clusters or other logical groupings where sampling occurred. For groupings without context, use `tags`.",
                        "example": "grid A1",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "featureType",
                        "type": "string",
                        "format": "default",
                        "description": "Type of feature (if any) that camera deployment is associated with. If other, more info can be provided in the `comments` field.",
                        "example": "",
                        "constraints": {
                            "required": False,
                            "enum": [
                                "none",
                                "road paved",
                                "road dirt",
                                "trail hiking",
                                "trail game",
                                "road underpass",
                                "road overpass",
                                "road bridge",
                                "culvert",
                                "burrow",
                                "nest site",
                                "carcass",
                                "water source",
                                "fruiting tree",
                                "other",
                            ],
                        },
                    },
                    {
                        "name": "habitat",
                        "type": "string",
                        "format": "default",
                        "description": "Short characterization of the habitat.",
                        "example": "Mixed temperate low-land forest",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "tags",
                        "type": "string",
                        "format": "default",
                        "description": "User defined tags associated with the deployment, as a pipe (`|`) separated list.",
                        "example": "Outside NP | Forest edge",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "comments",
                        "type": "string",
                        "format": "default",
                        "description": "Comments or notes about the deployment.",
                        "example": "",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "_id",
                        "type": "string",
                        "format": "default",
                        "description": "Internal attribute of data management system: ID of this deployment.",
                        "example": "",
                        "constraints": {"required": False},
                    },
                ],
                "missingValues": ["", "NaN", "nan"],
                "primaryKey": "deploymentID",
            },
        },
        {
            "name": "media",
            "path": "media.csv",
            "profile": "tabular-data-resource",
            "format": "csv",
            "mediatype": "text/csv",
            "encoding": "utf-8",
            "schema": {
                "name": "media",
                "title": "Media",
                "description": "Table with media files (images/videos) captured by the camera traps. Associated with deployments (`deploymentID`) and organized in sequences (`sequenceID`). Includes timestamp and file path.",
                "fields": [
                    {
                        "name": "mediaID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier (within a project) of the media file.",
                        "example": "m1",
                        "constraints": {"required": True, "unique": True},
                    },
                    {
                        "name": "deploymentID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier of the deployment this observation belongs to. Foreign key to `deployment:deploymentID`.",
                        "example": "dep1",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "sequenceID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier (within a project) of the sequence this media file belongs to. Sequences contain one or more media files (e.g. a single image or video or a sequence of successive images or videos) and are defined by `sequenceInterval` in the data package metadata.",
                        "example": "seq1",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "captureMethod",
                        "type": "string",
                        "format": "default",
                        "description": "Method used to capture this media file.",
                        "example": "motion detection",
                        "constraints": {
                            "required": False,
                            "enum": ["motion detection", "time lapse"],
                        },
                    },
                    {
                        "name": "timestamp",
                        "type": "datetime",
                        "format": "%Y-%m-%dT%H:%M:%S%z",
                        "description": "Date and time when the media file was recorded, as an ISO 8601 formatted string with timezone designator (`YYYY-MM-DDThh:mm:ssZ` or `YYYY-MM-DDThh:mm:ss±hh:mm`).",
                        "example": "2020-03-24T11:21:46Z",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "filePath",
                        "type": "string",
                        "format": "default",
                        "description": "Url or relative path to the media file, respectively for externally hosted files or files that are part of this data package.",
                        "example": [
                            "https://trapper.org/storage/resource/media/259024/file/",
                            "gs://wildlife_insights/Project/Images/CT-011/IMG0001.jpg",
                            "DEP0001/IMG0001.jpg",
                        ],
                        "constraints": {
                            "required": True,
                            "pattern": "^(?=^[^./~])(^((?!\\.{2}).)*$).*$",
                        },
                    },
                    {
                        "name": "fileName",
                        "type": "string",
                        "format": "default",
                        "description": "Name of a media file. When this field is included, one should be able to sort media chronologically within a deployment on `timestamp` (first) and `fileName` (second).",
                        "example": "IMG0001.jpg",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "fileMediatype",
                        "type": "string",
                        "format": "default",
                        "description": "Mediatype of a media file.",
                        "example": "image/jpeg",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "exifData",
                        "type": "object",
                        "format": "default",
                        "description": "EXIF data of the file, as a valid JSON object.",
                        "example": '{ "EXIF": { "ISO": 200, "Make": "RECONYX"}',
                        "constraints": {"required": False},
                    },
                    {
                        "name": "favourite",
                        "type": "boolean",
                        "format": "default",
                        "description": "`true` if media file is deemed of interest, e.g. an exemplar photo of an individual.",
                        "example": "true",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "comments",
                        "type": "string",
                        "format": "default",
                        "description": "Comments or notes about the media file.",
                        "example": "corrupted file",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "_id",
                        "type": "string",
                        "format": "default",
                        "description": "Internal attribute of data management system: ID of this media file.",
                        "example": "",
                        "constraints": {"required": False},
                    },
                ],
                "missingValues": ["", "NaN", "nan"],
                "primaryKey": "mediaID",
                "foreignKeys": [
                    {
                        "fields": "deploymentID",
                        "reference": {
                            "resource": "deployments",
                            "fields": "deploymentID",
                        },
                    }
                ],
            },
        },
        {
            "name": "observations",
            "path": "observations.csv",
            "profile": "tabular-data-resource",
            "format": "csv",
            "mediatype": "text/csv",
            "encoding": "utf-8",
            "schema": {
                "name": "observations",
                "title": "Observations",
                "description": "Table with observations based on the media files. Associated with deployments (`deploymentID`), sequences (`sequenceID`) and optionally individual media files (`mediaID`). Observations can mark non-animal events (camera setup, human, blank) or one or more animal observations (`observationType` = `animal`) of a certain taxon, count, life stage, sex, behaviour and/or individual.",
                "fields": [
                    {
                        "name": "observationID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier (within a project) of the observation.",
                        "example": "obs1",
                        "constraints": {"required": True, "unique": True},
                    },
                    {
                        "name": "deploymentID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier of the deployment this observation belongs to. Foreign key to `deployment:deploymentID`.",
                        "example": "dep1",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "sequenceID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier of the sequence (collection of media files grouped by a predefined `sequenceInterval`) that is the source of this observation. Foreign key to `media:sequenceID`.",
                        "example": "seq1",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "mediaID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier of the media file that is the source of this observation. Foreign key to `media:mediaID`. Include but leave empty for sequence-based observations.",
                        "example": "m1",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "timestamp",
                        "type": "datetime",
                        "format": "%Y-%m-%dT%H:%M:%S%z",
                        "description": "Date and time of the observation, as an ISO 8601 formatted string with timezone designator (`YYYY-MM-DDThh:mm:ssZ` or `YYYY-MM-DDThh:mm:ss±hh:mm`). For file-based observations this is the `timestamp` of the associated media file (in `mediaID`), for sequence-based observations the `timestamp` of the first media file in the associated sequence (in `sequenceID`).",
                        "example": "2020-03-24T11:21:46Z",
                        "constraints": {"required": True},
                    },
                    {
                        "name": "observationType",
                        "type": "string",
                        "format": "default",
                        "description": "Type of observation. All categories in this vocabulary have to be understandable from an AI point of view. `unknown` describes classifications with a confidence level below some predefined threshold i.e. neither humans nor AI can say what was recorded.",
                        "example": "animal",
                        "constraints": {
                            "required": True,
                            "enum": [
                                "animal",
                                "human",
                                "vehicle",
                                "blank",
                                "unknown",
                                "unclassified",
                            ],
                        },
                    },
                    {
                        "name": "cameraSetup",
                        "type": "boolean",
                        "format": "default",
                        "description": "`true` if this observation is part of the camera setup process (camera deployment, pickup, maintenance).",
                        "example": "false",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "taxonID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier of the `scientificName` according to the taxonomic reference list defined by `taxonIDReference` in the data package metadata.",
                        "example": "QLXL",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "scientificName",
                        "type": "string",
                        "format": "default",
                        "description": "Scientific name of the observed individual(s).",
                        "example": "Canis lupus",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "count",
                        "type": "integer",
                        "format": "default",
                        "description": "Number of observed individuals (optionally of given life stage, sex and behaviour).",
                        "example": "5",
                        "constraints": {"required": False, "minimum": 1},
                    },
                    {
                        "name": "countNew",
                        "type": "integer",
                        "format": "default",
                        "description": "Number of new (= previously uncounted) individuals in the associated media file (`mediaID`) taking into account the entire sequence (`sequenceID`) (optionally of given life stage, sex and behaviour).",
                        "example": "2",
                        "constraints": {"required": False, "minimum": 0},
                    },
                    {
                        "name": "lifeStage",
                        "type": "string",
                        "format": "default",
                        "description": "Age class or life stage of observed individual(s). Term borrowed from [Darwin Core](http://rs.tdwg.org/dwc/terms/lifeStage).",
                        "example": "adult",
                        "constraints": {
                            "required": False,
                            "enum": [
                                "adult",
                                "subadult",
                                "juvenile",
                                "offspring",
                                "unknown",
                            ],
                        },
                    },
                    {
                        "name": "sex",
                        "type": "string",
                        "format": "default",
                        "description": "Sex of observed individual(s)",
                        "example": "female",
                        "constraints": {
                            "required": False,
                            "enum": ["female", "male", "unknown"],
                        },
                    },
                    {
                        "name": "behaviour",
                        "type": "string",
                        "format": "default",
                        "description": "Dominant behaviour of observed individual(s), ideally expressed as controlled values (e.g. grazing, browsing, rooting, vigilance, running, walking). A combination of dominant (first) and additional behaviours can be expressed as a pipe (`|`) separated list.",
                        "example": "vigilance",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "individualID",
                        "type": "string",
                        "format": "default",
                        "description": "Unique identifier (within a project) of the observed individual.",
                        "example": "RD213",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "classificationMethod",
                        "type": "string",
                        "format": "default",
                        "description": "Classification method.",
                        "example": "human",
                        "constraints": {
                            "required": False,
                            "enum": ["human", "machine"],
                        },
                    },
                    {
                        "name": "classifiedBy",
                        "type": "string",
                        "format": "default",
                        "description": "Name or unique identifier of the person or AI algorithm that classified this observation.",
                        "example": ["Jakub Bubnicki", "Megadetector"],
                        "constraints": {"required": False},
                    },
                    {
                        "name": "classificationTimestamp",
                        "type": "datetime",
                        "format": "%Y-%m-%dT%H:%M:%S%z",
                        "description": "Date and time of the classification, as an ISO 8601 formatted string with timezone designator (`YYYY-MM-DDThh:mm:ssZ` or `YYYY-MM-DDThh:mm:ss±hh:mm`).",
                        "example": "2020-08-22T10:25:19Z",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "classificationConfidence",
                        "type": "number",
                        "format": "default",
                        "description": "Accuracy confidence of the classification. Expressed as a probability, with `1` being maximum confidence. Provide an approximate value for human classifications.",
                        "example": "0.95",
                        "constraints": {"required": False, "minimum": 0, "maximum": 1},
                    },
                    {
                        "name": "comments",
                        "type": "string",
                        "format": "default",
                        "description": "Comments or notes about the observation.",
                        "example": "",
                        "constraints": {"required": False},
                    },
                    {
                        "name": "_id",
                        "type": "string",
                        "format": "default",
                        "description": "Internal attribute of data management system: ID of this observation.",
                        "example": "",
                        "constraints": {"required": False},
                    },
                ],
                "missingValues": ["", "NaN", "nan"],
                "primaryKey": "observationID",
                "foreignKeys": [
                    {
                        "fields": "deploymentID",
                        "reference": {
                            "resource": "deployments",
                            "fields": "deploymentID",
                        },
                    },
                    {
                        "fields": "sequenceID",
                        "reference": {"resource": "media", "fields": "sequenceID"},
                    },
                    {
                        "fields": "mediaID",
                        "reference": {"resource": "media", "fields": "mediaID"},
                    },
                ],
            },
        },
    ],
}

Thanks for your help!

Please preserve this line to notify @roll (lead of this repository)

roll commented 3 years ago

Hi @niconoe,

The Report object has the following structure:

valid: # global valid
errors: [] # global errors
tasks: 
  - valid: # task valid
    errors: [] # task errors

So in your case there are no global errors such as "Invalid Data Package's Metadata", but there are errors in the data package's resources (metadata or data).

https://framework.frictionlessdata.io/docs/guides/validation-guide#validation-report
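To make the two levels concrete, here is a minimal sketch that gathers errors from both the top level and each task. It works on a plain dict shaped like the structure above rather than on a real Report object, so `collect_errors`, `example_report`, and the placeholder error entry are illustrative assumptions, not part of the frictionless API.

```python
from typing import Any, Dict, List, Tuple


def collect_errors(report: Dict[str, Any]) -> List[Tuple[str, Any]]:
    """Return (scope, error) pairs from the top level and from every task."""
    found = [("report", error) for error in report.get("errors", [])]
    for index, task in enumerate(report.get("tasks", [])):
        found.extend((f"task[{index}]", error) for error in task.get("errors", []))
    return found


# Hypothetical report: no top-level errors, but one failing resource task.
example_report = {
    "valid": False,
    "errors": [],
    "tasks": [
        {"valid": True, "errors": []},
        {"valid": False, "errors": [{"code": "example-error", "message": "placeholder"}]},
    ],
}

for scope, error in collect_errors(example_report):
    print(scope, error["code"])
```

Checking only the top-level `errors` list would report nothing for this example, which matches the situation described in the original question.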

Please let me know if this doesn't help.

niconoe commented 3 years ago

Okay, thanks. Two follow-up questions:

roll commented 3 years ago

@niconoe Sure, if there are any resource errors, the top-level valid will be False.
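As a rough illustration of that rule, the sketch below derives the expected top-level flag from the error lists alone. It again uses a plain dict with the `valid` / `errors` / `tasks` layout rather than a real Report object, and `expected_valid` is a hypothetical helper name; it assumes the flag behaves exactly as described in this reply.

```python
from typing import Any, Dict


def expected_valid(report: Dict[str, Any]) -> bool:
    """True only when neither the report nor any of its tasks lists an error."""
    if report.get("errors"):
        return False
    return all(not task.get("errors") for task in report.get("tasks", []))


# Hypothetical reports mirroring the layout discussed in this thread.
clean = {"valid": True, "errors": [], "tasks": [{"valid": True, "errors": []}]}
broken_task = {
    "valid": False,
    "errors": [],
    "tasks": [{"valid": False, "errors": [{"code": "example-error"}]}],
}

print(expected_valid(clean))
print(expected_valid(broken_task))
```

Under that assumption, reading the top-level `valid` is enough to know whether anything failed; the per-task lists are only needed to find out what failed and where.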

I think we need to polish the terminology, as the word "global" might be confusing in this context. Here is a better explanation:

niconoe commented 3 years ago

Thanks @roll, that's much clearer to me now!

If I may add a suggestion: I think it would be great if:

1) all that information appeared in the Report API documentation (so future users won't bother you here like I'm doing right now)
2) as you said, the terminology were also polished a bit, which could make things more intuitive for newcomers. For example, I guess things might be a bit clearer if report.errors were renamed to something more explicit, such as report.metadata_errors or report.global_errors.

Thanks again for the help and clarifications!