datasets / publicbodies

A database of public bodies such as government departments, ministries etc.
http://publicbodies.org
MIT License

Decide whether or not organizational units are in scope #61

Open augusto-herrmann opened 10 years ago

augusto-herrmann commented 10 years ago

Which of the following is in scope for this project's data?

a) only organizations (as in org:Organization); or
b) organizations and their respective hierarchy of organizational units (as in org:OrganizationalUnit)?

augusto-herrmann commented 10 years ago

Some countries' data (e.g. Switzerland) already seem to include organizational units.

If it's in scope I would also include this data from Brazil, as it is available.

rufuspollock commented 10 years ago

@augusto-herrmann @hannesgassert I think it would be good to add these, but we should first agree on the column name and its meaning and add it to datapackage.json ...

augusto-herrmann commented 9 years ago

How about this?

          {
            "id": "type",
            "type": "string",
            "description": "Type of entry: 'o' for organization level, 'ou' for organizational unit level"
          },

rufuspollock commented 9 years ago

@augusto-herrmann seems sensible though I dislike "type" as it is so overloaded. Perhaps "organizationType" or "organization-level" might be better.

/cc @hannesgassert

augusto-herrmann commented 9 years ago

Agreed.

But we should use "organization_level" (with an underscore) in order to be consistent with the word separation scheme used in the rest of the column names.

augusto-herrmann commented 8 years ago

If no one opposes this change to the data model, I should add this soon-ish.

Existing data should be updated with the new organization_level column, with its cells kept blank until they can be filled in from official sources.
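A rough sketch of what that migration step could look like, assuming plain CSV text with a header row. The function name `add_blank_column` and the sample columns `id` and `name` are only illustrative and are not taken from the actual data files:

```python
import csv
import io


def add_blank_column(csv_text: str, column: str = "organization_level") -> str:
    """Return csv_text with `column` appended to the header and left empty in every row.

    Hypothetical helper for illustration; the real files may need extra care
    (encoding, quoting style, column order).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames or [])
    if column not in fieldnames:
        # Append the new column only once, so re-running is harmless.
        fieldnames.append(column)

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Cells stay blank until they can be filled in from official sources.
        row.setdefault(column, "")
        writer.writerow(row)
    return out.getvalue()


# Example with made-up data (column names are assumptions):
sample = "id,name\nbr/abc,Example Body\n"
migrated = add_blank_column(sample)
```

Running it twice on the same input gives the same result, so it could be applied to all country files without checking which ones were already updated.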

todrobbins commented 8 years ago

I think we should be verbose in the values and list the organization_level as Organization or Organizational Unit.

Or another proposal would be:

augusto-herrmann commented 3 years ago

Here's another idea: add organizational units in a different CSV file.

Organizational units can be very numerous, on the order of several thousand per country. Their structure also tends to change much more often. Putting them in a separate file would make downloading easier for people who are only interested in the main organizations. It would also allow a different update schedule for them.

The main organizations would remain where they are, in /data. The complete files with main organizations and units could go in a subfolder named /data/organizational_units, so we would have a new folder of CSV files with the same names and the same schema as the main ones, but much larger.
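To make the proposed layout concrete, here is a minimal sketch of the path mapping. The helper name `units_path_for` and the file name `br.csv` are hypothetical; only the `organizational_units` subfolder name comes from the proposal above:

```python
from pathlib import PurePosixPath


def units_path_for(main_csv: str) -> str:
    """Map a main CSV path to its counterpart in the organizational_units subfolder.

    Hypothetical helper: same file name, one folder deeper.
    """
    path = PurePosixPath(main_csv)
    return str(path.parent / "organizational_units" / path.name)


# Example with an assumed file name:
full_file = units_path_for("data/br.csv")
```

Anyone who only needs the main organizations would keep reading from /data and never touch the subfolder.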

augusto-herrmann commented 3 years ago

That file would contain the full structure of government down to the smallest internal unit. This data tends to get very large very quickly and update very frequently.

We recently started publishing a daily CSV of this for Brazil, and it's a 124 MB file. That is not so large in itself, but tracking its changes in Git may make the repository a lot slower and unwieldy.

I'm open to discussing other alternatives, or whether it is really OK to store a file this large, and this frequently updated, in a Git repo.

Your thoughts, @todrobbins, @rufuspollock, @hannesgassert?