FamilySearch / GEDCOM

Add note about empty place levels #495

Open dthaler opened 3 months ago

dthaler commented 3 months ago

The discussion of <g7:PLAC> says:

The principal place in which the superstructure’s subject occurred, represented as a List of jurisdictional entities in a sequence from the lowest to the highest jurisdiction. As with other lists, the jurisdictions are separated by commas. Any jurisdiction’s name that is missing is still accounted for by an empty string in the list.

The type of each jurisdiction is given in the PLAC.FORM substructure, if present, or in the HEAD.PLAC.FORM structure. If neither is present, the jurisdictional types are unspecified beyond the lowest-to-highest order noted above.

The sentence "Any jurisdiction’s name that is missing is still accounted for by an empty string in the list." is somewhat counter-intuitive. That is, the reader wonders: why would one not just remove that level from PLAC.FORM and omit the empty level?

For example, why have:

3 PLAC , Oneida, Idaho, USA
4 FORM City, County, State, Country

instead of:

3 PLAC Oneida, Idaho, USA
4 FORM County, State, Country

which certainly makes for smaller files? This seems to warrant a note at least.
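
To make the comparison concrete, here is a minimal Python sketch. It is not taken from the spec, and the helper name `label_place` is hypothetical. It assumes a PLAC payload and its FORM can both be split on commas, and that an empty level is simply skipped. Under that reading, the two encodings above carry the same labeled jurisdictions.

```python
def label_place(plac: str, form: str) -> dict[str, str]:
    """Pair each jurisdiction in a PLAC payload with its FORM label.

    Hypothetical helper: splits both lists on commas, trims whitespace,
    and drops levels whose name is empty.
    """
    names = [part.strip() for part in plac.split(",")]
    labels = [part.strip() for part in form.split(",")]
    return {label: name for label, name in zip(labels, names) if name}


with_empty = label_place(", Oneida, Idaho, USA", "City, County, State, Country")
without_empty = label_place("Oneida, Idaho, USA", "County, State, Country")

# Both encodings map County, State and Country to the same names.
print(with_empty == without_empty)
```

The equivalence only holds for a reader that actually consults FORM; a reader that assumes fixed positions would treat the two payloads differently.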

Some possible cases to consider:

In short, the spec contains no rationale for why having an empty level at the beginning of a PLAC would be good behavior. Perhaps making no recommendation either way is ok, but I think at least a note would improve clarity. Maybe a technical FAQ discussion with several examples might also be appropriate.

Other examples to consider:

2 PLAC Virginia Beach, Virginia, USA
3 FORM City, State, Country

vs

2 PLAC Virginia Beach, , Virginia, USA
3 FORM City, County, State, Country

2 PLAC Salem
3 FORM City

vs

2 PLAC Salem, , ,
3 FORM City, County, State, Country

albertemmerich commented 3 months ago

My observation is: Many applications default to city, county, state, country and ignore any FORM (from HEAD and from the superstructure PLAC). As a result,

2 PLAC Virginia Beach, Virginia, USA

will show county Virginia and state USA in those applications. This does not happen with the commonly used version

2 PLAC Virginia Beach, , Virginia, USA

As long as I can describe a place by the four jurisdictions city, county, state, country, I would prefer to use the empty value for a missing / unknown jurisdiction, and not give a FORM in every call of this place.
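
The behavior described here can be sketched in Python as follows. This is an illustration of an application that ignores FORM and assumes four fixed positions, not the code of any particular product; `DEFAULT_LEVELS` and `read_positionally` are hypothetical names.

```python
# Assumed application default: four fixed positions, FORM is never consulted.
DEFAULT_LEVELS = ("City", "County", "State", "Country")


def read_positionally(plac: str) -> dict[str, str]:
    """Assign each comma-separated level to a fixed label by position."""
    names = [part.strip() for part in plac.split(",")]
    return dict(zip(DEFAULT_LEVELS, names))


print(read_positionally("Virginia Beach, Virginia, USA"))
# {'City': 'Virginia Beach', 'County': 'Virginia', 'State': 'USA'}

print(read_positionally("Virginia Beach, , Virginia, USA"))
# {'City': 'Virginia Beach', 'County': '', 'State': 'Virginia', 'Country': 'USA'}
```

In this sketch `zip` silently drops any level beyond the fourth, which is only one of several things a real application might do with longer payloads.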

Norwegian-Sardines commented 3 months ago

Personally, I think PLAC.FORM is an outdated and little used concept. The PLAC tag values can indicate the value is a township, county, or other indicator if necessary and with modern mapping software even with missing level values can still be used to locate a majority of places. And as I’ve said elsewhere, a place does not have to include just city, state, country, but cemeteries, hospitals, house addresses, so a place can be located on a map.

dthaler commented 3 months ago

My observation is: Many applications default to city, county, state, country and ignore any FORM (from HEAD and from the superstructure PLAC).

What do said applications do when they get PLAC payloads with a larger number of levels, such as 6?

dthaler commented 3 months ago

Personally, I think PLAC.FORM is an outdated and little used concept. The PLAC tag values can indicate the value is a township, county, or other indicator if necessary and with modern mapping software even with missing level values can still be used to locate a majority of places. And as I’ve said elsewhere, a place does not have to include just city, state, country, but cemeteries, hospitals, house addresses, so a place can be located on a map.

I suspect PLAC.FORM was added to provide labels for use in a user interface, next to the values of the levels. So if you have a PLAC payload with, say, 6 labels, then a UI that supports at least that many separate fields could get the labels from FORM. My own app does not use such labels, but some apps may want to have such a UI.

tychonievich commented 3 months ago

Discussed in steering committee 2 JULY 2024

fisharebest commented 3 months ago

FWIW, my own application ignores empty place levels.

Firstly, when navigating place hierarchies and maps, it becomes necessary to create dummy (unknown) labels for the missing levels. This leads to a confusing UI.

Secondly, like NULL in databases, missing levels are not equal. e.g. these two places are different, because the missing level could be different.

For the same reason, I also ignore PLAC.FORM.
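
A rough Python sketch of both points, under stated assumptions: `known_levels` drops empty levels the way the comment describes, and `same_place` treats an empty level like a database NULL, returning `None` for "unknown". Both function names and the sample payloads are illustrative and are not taken from the application or from the original comment. Treating payloads with different level counts as different is also an assumption of this sketch.

```python
from typing import Optional


def known_levels(plac: str) -> list[str]:
    """Drop empty levels, keeping only the names that are actually present."""
    return [name for name in (part.strip() for part in plac.split(",")) if name]


def same_place(a: str, b: str) -> Optional[bool]:
    """Compare two payloads level by level, treating an empty level like NULL.

    Returns False if the level counts differ or any known pair of names differs,
    None (unknown) if no known pair differs but some level is missing,
    and True only if every level is known and equal.
    """
    levels_a = [part.strip() for part in a.split(",")]
    levels_b = [part.strip() for part in b.split(",")]
    if len(levels_a) != len(levels_b):
        return False
    unknown = False
    for x, y in zip(levels_a, levels_b):
        if not x or not y:
            unknown = True  # a missing level could stand for anything
        elif x != y:
            return False
    return None if unknown else True


print(known_levels("Salem, , ,"))
print(same_place("Springfield, , USA", "Springfield, , USA"))
```

The second call returns `None` rather than `True`: the two payloads are textually identical, but the missing middle level could name a different state in each.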

albertemmerich commented 3 months ago

The way of describing places (including buildings, cemeteries, churches, parishes, etc.) is outdated in GEDCOM. There is great demand to switch to a structure of hierarchical objects. GEDCOM 7.0 and its PLAC, PLAC.FORM structure fail to fulfill the requirements:

All of these conditions will result in varying PLAC payloads for the same place object, and GEDCOM 7.0 does not provide a way to show that these payloads describe the same place. A workaround may be to use EXID and EXID.TYPE to point to an external database like the Historic Gazetteer (gov.genealogy.net). With that, the hierarchy of jurisdictions is no longer needed, as every object in it uniquely describes the place, providing the type of the object, the time-dependent jurisdictions of the next levels, as well as the name in different languages, at different times, and so on.

Based on these ideas the German GEDCOM-L group has defined an extension of "location records", using one record for every place object and providing all the data known for the object. In GEDCOM 5.5.1 and in GEDCOM 7.0 we cite these location records by

2 PLAC <place payload>
3 _LOC <XREF:_LOC>

This way, the PLAC structure remains in the GEDCOM file and can be imported by applications that do not use the _LOC records, but the place payload may vary for the same place object.
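
As a rough illustration of how a reader could use the pointer, the Python sketch below groups PLAC payloads by the `_LOC` cross-reference found directly beneath them. The function name `payloads_by_location` and the identifier `@L1@` are hypothetical, and the line parsing is deliberately simplistic (level, tag, and value separated by single spaces); it is not the GEDCOM-L group's or any application's implementation.

```python
from collections import defaultdict


def payloads_by_location(lines: list[str]) -> dict[str, set[str]]:
    """Collect the PLAC payloads that point at each _LOC record."""
    groups: dict[str, set[str]] = defaultdict(set)
    plac_level, plac_payload = None, None
    for line in lines:
        parts = line.strip().split(" ", 2)
        if len(parts) < 2:
            continue  # skip blank or malformed lines
        level, tag = int(parts[0]), parts[1]
        value = parts[2] if len(parts) > 2 else ""
        if tag == "PLAC":
            plac_level, plac_payload = level, value
        elif plac_level is not None and level <= plac_level:
            # We have left the PLAC structure.
            plac_level, plac_payload = None, None
        elif tag == "_LOC" and plac_level is not None and level == plac_level + 1:
            groups[value].add(plac_payload)
    return dict(groups)


sample = [
    "2 PLAC Virginia Beach, Virginia, USA",
    "3 _LOC @L1@",
    "2 PLAC Virginia Beach, , Virginia, USA",
    "3 _LOC @L1@",
]
print(payloads_by_location(sample))
```

Both payloads end up under the same key, which is the point of the extension: the text may differ while the pointer identifies one place object.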

My application uses these _LOC records, and the user may identify different PLAC payloads as describing the same object. The payloads can be kept as is, only identifying the object by assigning the . The application provides the interpretation of PLAC.FORM, too, not restricted to city objects, but including buildings, cemeteries, churches and other objects. The observation is that users do use the process of identifying different place payloads and merging them into one place object, but do not use the PLAC.FORM structures provided by the application.

As most applications do not interpret the FORM data, the transfer of place data between applications is based on the PLAC payload only. We should therefore not modify any elements in the GEDCOM 7.x spec for places in a way that would require modifications of the applications.

I propose to switch to a location record system in the next major GEDCOM version, and to do this as soon as possible.