FamilySearch / GEDCOM

Apache License 2.0

Extending SPLAC to include ADDR and PLAC? #536

Open mother10 opened 3 months ago

mother10 commented 3 months ago

Since I started reading GEDCOM, I have always wondered why ADDR had to be used in some places in GEDCOM and PLAC in others. Inside ADDR we have tags for CITY, STAE, POST and CTRY. But inside PLAC we have jurisdictions that do the same?! In places in GEDCOM where only PLAC is allowed now, people start using an address as the leftmost, smallest jurisdiction, because they want to denote an address rather than a place. Or they use the name of a church as the leftmost jurisdiction.

Now I have seen the proposals for SPLAC (see #520 and #527). I want to add this to #520, "Adding SPLAC beside PLAC", because that is more in line with what I will write here. The comparison in that proposal was made with NOTE and SNOTE. But I think it should be with REPO and SOUR. Why?

1 (one) Repository can contain many sources. At an event, we can link to a Source, which can link to a Repository.

I wonder if addresses are not the same.

1 (one) City can have many addresses. So at an event we should have just an address described, and that address should point to a City (SPLAC), which in turn points to a state, which ... etc., just as in the new spec of GEDCOM. The address at the event should NOT have CITY, STAE, POST and CTRY, because that way we have more or less the same information in several places in the GEDCOM.

And to maybe make things clearer, we should not call it ADDR, but maybe BUILDING (or an abbreviation of that, like BLDNG). A BUILDING can be someone's home, a church, a castle, a university, a harbour, etc.

Because more than one child can be born in 1 home, and in 1 church many children can be baptised or people can have their wedding, BUILDING should also be a record-like structure, not a property of something.

In fact, BUILDING is the smallest form of SPLAC. It should have a name or a title so it can be identified for a user.

So wherever it is now allowed to have either ADDR or PLAC, we will now have BUILDING there. That just describes an address in user-understandable text and points to the CITY SPLAC entity. I looked at addresses around the world, and they seem way too complicated to "catch" in a specification. That's why I say that, inside the BUILDING, the address is written as completely as the user wants, but that text will only be output by a program (shown to the user), NOT interpreted. It is not used to define where the BUILDING is on a map, because of the complications. Defining a BUILDING on a map is done by the link to the "first" SPLAC in the chain.

Now because there can be more BUILDINGs in a City, inside BUILDING we can also use MAP with its 2 coördinates to pinpoint its position in a City. If that position is not present, the BUILDING will be put at the center of the city it belongs to (the default position of the City itself), on top of any others that have no further positioning.

By adding MAP to BUILDING, it will now also be possible to denote the birthplace when a child is born on a ship, in a plane, or in a car somewhere because a hospital could not be reached in time. The same is valid for a death at sea. So instead of dying in the "Atlantic Ocean", which is so huge that we have no idea where it might have happened, we may be able to figure out from the route a ship took and the date of the death where it might have been, and show a bit better where in that immense ocean it happened. So BUILDING could also be a plane or a boat.

I think BUILDING should also have a NOTE structure and a SOURCE citation.

And SPLAC would have a CHAN and a CREA.

Areas: I remember seeing someone, I think Luther, talking about people wanting an area to denote an event. That is also possible: we define RADIUS under MAP, give it in meters, kilometers, or anything else, and we have a circle with a center, denoting the area where things took place.

1 MAP
2 LATI N18.150944
2 LONG E168.150944
2 RADIUS 5.4KM
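A minimal Python sketch of how an application might test whether a point falls inside such a circle. RADIUS is only the tag proposed in this comment, not part of the current specification. The helper names, the accepted units (M and KM), and the use of the haversine great-circle distance on a spherical Earth are all assumptions made for illustration.

```python
import math
import re


def parse_coord(value: str) -> float:
    """Convert a LATI/LONG payload such as 'N18.150944' to signed degrees."""
    hemisphere = value[0].upper()
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unexpected hemisphere in {value!r}")
    number = float(value[1:])
    return -number if hemisphere in ("S", "W") else number


def parse_radius_m(value: str) -> float:
    """Convert a hypothetical RADIUS payload such as '5.4KM' or '500M' to meters."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(KM|M)\s*", value.upper())
    if match is None:
        raise ValueError(f"unsupported RADIUS payload {value!r}")
    number, unit = float(match.group(1)), match.group(2)
    return number * 1000.0 if unit == "KM" else number


def distance_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Haversine distance in meters, assuming a spherical Earth."""
    earth_radius_m = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius_m * math.asin(math.sqrt(a))


def in_area(lati: str, long_: str, radius: str, lat: float, lon: float) -> bool:
    """True if (lat, lon) lies within the circle described by MAP plus RADIUS."""
    center_lat, center_lon = parse_coord(lati), parse_coord(long_)
    return distance_m(center_lat, center_lon, lat, lon) <= parse_radius_m(radius)
```

For example, `in_area("N18.150944", "E168.150944", "5.4KM", 18.160944, 168.150944)` asks whether a point roughly one kilometer north of the center is inside the area.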

Instead of RADIUS we could also have SQUARE or RECT, like:

1 MAP
2 LATI N18.150944
2 LONG E168.150944
2 RECT X:+5.7KM Y:-5.7KM

Depending on the + or -, the coordinates denote which corner of the rectangle the MAP point is. (If both are +, the point is the bottom-left corner; if both are -, it is the top-right corner.)
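The sign convention can be sketched in a few lines of Python. RECT is only the tag proposed in this comment. The function name is hypothetical, and the two mixed-sign cases (top left, bottom right) are inferred from the two cases the comment states, assuming +X extends east and +Y extends north.

```python
import re


def rect_corner(payload: str) -> str:
    """Return which corner of the rectangle the MAP point is, from the RECT signs."""
    match = re.fullmatch(
        r"X:([+-])[0-9.]+KM\s+Y:([+-])[0-9.]+KM", payload.strip().upper()
    )
    if match is None:
        raise ValueError(f"unsupported RECT payload {payload!r}")
    x_sign, y_sign = match.groups()
    # +X means the rectangle extends to the east, so the point is on its left edge.
    horizontal = "left" if x_sign == "+" else "right"
    # +Y means the rectangle extends to the north, so the point is on its bottom edge.
    vertical = "bottom" if y_sign == "+" else "top"
    return f"{vertical} {horizontal}"
```

Under these assumptions the payload in the example, `X:+5.7KM Y:-5.7KM`, would make the MAP point the top-left corner.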

But to me RADIUS seems easier.

My guess is this will also reduce GEDCOM size, as a lot of "doubled text" (from all the ADDRs that are in fact the same) will be removed and moved into the corresponding SPLAC records.

If BUILDING has a MAP structure and SPLAC too, the MAP of BUILDING should be used to show it on a map, as that's more precise. The MAP of a City points to an arbitrary point inside the City: a default point, in case the BUILDINGs pointing to that City have no MAP structure.
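The precedence rule described here could look roughly like this in Python. This is only a sketch: BUILDING is the proposed structure, and the class and function names are hypothetical, not taken from any specification or application.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in signed decimal degrees


@dataclass
class Splac:
    """A shared place record, for example a city, with an optional default MAP point."""
    name: str
    map: Optional[Coord] = None


@dataclass
class Building:
    """The proposed BUILDING: free-text address, a link to its city, optional MAP."""
    title: str
    city: Splac
    map: Optional[Coord] = None


def display_position(building: Building) -> Optional[Coord]:
    """Prefer the BUILDING's own MAP; otherwise fall back to the city's default point."""
    if building.map is not None:
        return building.map
    return building.city.map
```

A BUILDING without its own MAP is then drawn at the city's default point, and one with a MAP is drawn at its own, more precise position.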

In the "SPLAC beside PLAC" md file, I think there is an SPLAC record missing under RECORD :=. For the TYPE I would choose to have everything in uppercase, so CITY, COUNTY, STATE, COUNTRY. I think it might be clearer if there was also an example with a real place name.

Maybe, in case of the above example for an ocean, have a TYPE OCEAN too? And maybe a TYPE AIR? I presume these last 2 types have no other SPLACs that they point to, or that point to them.

Another thing about SPLAC: now it has coördinates. But what if there were more ways of denoting a place than just LAT and LONG and an address? Then it might need a TYPE to tell what mapping system is used. And maybe more than 1 mapping system can be used for 1 SPLAC? There seem to be other systems like What3words, UTM coding, Plus Codes, and maybe more. But those do not seem very common yet?

I am sure I forgot things, but I wanted to add it here, to maybe inspire someone.

Norwegian-Sardines commented 1 month ago

Dirk: I had hoped that the comma-separated location entities would be replaced by SPLAC records. Your answers after that put them into perspective again (from your side, no claim to completeness). In your proposal, the comma-separated elements make it seem as if nothing has changed compared to older GEDCOM versions, so I would appreciate it if examples such as...

Reply: To be honest, I had hoped that a hierarchical system of SPLAC records could have been implemented, but I saw too many problems with this implementation. I’m hoping that my interaction with Albert sheds light on why I think a record-based hierarchy is not possible.


Dirk: How is a beginner to know that he has to enter comma-separated elements in ascending order in his software?

Reply: All of the software I have dealt with to this point tells its users how to enter Place information in its help text. These applications already understand how to help their user base enter the data. Many users never even think or care about GEDCOM, but if they do, the software currently creates a GEDCOM with comma-separated places from their input. We are creating a protocol for transmission of data. The application can ask for data in any form it wants; it is the application’s job to understand GEDCOM and create the output as The Standard dictates.


Dirk: It should be clearer which of the place texts would then be used for output. Especially with extensive list output, a place does not have to be repeated each time with all its place entities.

Reply: My hope is that applications would use the ABBR (abbreviation) payload from the primary PLAC payload as their display value. A SPLAC record has a SPLAC.PLAC payload and a SPLAC.PLAC.ABBR payload. The Paris Texas SPLAC record could look like this:

0 @P1@ SPLAC
1 PLAC Paris, Lamar County, Texas, USA
2 ABBR Paris Texas
2 DATE AFT 1845
1 PLAC Paris, Red River County, Republic of Texas
2 ABBR Paris, Republic of Texas
2 DATE BEF 1840
1 PLAC Paris, Lamar County, Republic of Texas
2 ABBR Paris Texas
2 DATE FROM 1840 TO 1845

NOTE: This record includes multiple Place Descriptions based on the history of the city over time. Only the first PLAC payload is required. A report created from this record could display either the SPLAC.PLAC payload or the SPLAC.PLAC.ABBR payload from the primary (first) PLAC. However, if a relative was born in Paris before Texas became part of the USA, the history may be important to a user, and they may want to display “Paris Republic of Texas” rather than “Paris USA”.

This example outlines one issue with the GEDCOM-L multiple level SPLAC design. An SPLAC record would need to be created for each entity presented: 1) Paris, 2) Lamar County, 3) Texas, 4) USA, 5) Red River County, 6) Republic of Texas.

Paris would need two date-based links to superior levels: 1) Lamar County, 2) Red River County. Lamar County would need two date-based links to superior levels: 1) Republic of Texas, 2) Texas.

The second issue is that the Record Instance of “Paris” has no identifier in it to indicate that it is the Paris that will eventually get us into a Texas land mass. Paris Ohio would start with Paris at the lowest level too!


Dirk: I would not see the new SPLAC structure as a replacement for ADDR and would rather enter “Highland Cemetery, John Doe Grave” as ADDR. For me, ADDR are therefore not always mandatory postal addresses.

Reply: I do see the place of John Doe’s grave as just another place entity. I map such places with an exact GPS locator, and I can’t do that with an Address that does not have an SPLAC record associated with it! This also holds true for any place that I can map with a GPS locator. You may not do this, and this is fine, but I think readers of my website or book would like to be able to see where in the world a person was born, lived, and died, and not just the closest city!


Dirk: The software can then generate an additional glossary page in which, among other things, “Hmbg = Hamburg” can be found as an explanation. Or have I misinterpreted ABBR here?

Reply: Either interpretation can be used, although I don’t think abbreviating Hamburg is valuable, but to each his own. In my example above the abbreviation (ABBR) could be “Paris TX”, but how many of my readers in Norway, Germany, England and other places know what TX means? I would not know what Hmbg means; could it be “Humbug”?


Dirk: I don't particularly like the combination of PLAC and DATE to document historical name changes of the place (example “Oslo”). What should my software do with it? What can I do with the information as a user? “Oslo Fylke” would confuse me as a user. I don't speak Norwegian. Is this a two-part place name or a street name? Is this PLAC/DATE listing correct at all and where does it come from (don't you also have to insert a SOUR?)? I could therefore do without DATE information.

Reply: What would you do differently to document and maintain historical place names? I’m open to suggestions. A relative of mine could have been born in Christiania, Norway, but died in Oslo, Norway. I would need to know that these are the same place so I can link to the same SPLAC record instance for both events; otherwise a user of my site could want to create a new SPLAC record instance. The DATE payload only helps to create a list of SPLAC records that could be the place of the event. The data entry person would pick the proper Place Description and Time Frame from the list, for example Christiania in 1850. Oslo Fylke is the next highest district above the City of Oslo. In English it could be classified as a “county”, but if I use that term in the USA they would not understand, and a Norwegian might correct me.
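As a rough illustration of how the DATE payload could narrow down the candidate list, here is a Python sketch using the Paris record shown earlier in the thread. The class and function names are hypothetical, dates are reduced to whole years, and the bounds of AFT, BEF and FROM ... TO are treated as inclusive; all of that is a simplifying assumption, not something the proposal prescribes.

```python
from typing import List, NamedTuple, Optional


class PlaceName(NamedTuple):
    text: str                  # PLAC payload
    abbr: str                  # ABBR payload
    first_year: Optional[int]  # None means no lower bound
    last_year: Optional[int]   # None means no upper bound


# The three PLAC substructures of the Paris record, in record order.
PARIS_TEXAS = [
    PlaceName("Paris, Lamar County, Texas, USA", "Paris Texas", 1845, None),
    PlaceName("Paris, Red River County, Republic of Texas",
              "Paris, Republic of Texas", None, 1840),
    PlaceName("Paris, Lamar County, Republic of Texas", "Paris Texas", 1840, 1845),
]


def candidates_for_year(names: List[PlaceName], year: int) -> List[PlaceName]:
    """Return every place description whose date range could cover the event year."""
    return [
        name for name in names
        if (name.first_year is None or name.first_year <= year)
        and (name.last_year is None or year <= name.last_year)
    ]
```

For an event in 1842 this leaves only the Republic of Texas description with Lamar County. For a boundary year such as 1840 it leaves two candidates, which is where the data entry person would still have to choose.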

albertemmerich commented 1 month ago

Are we now mixing GEDCOM data transfer and internal application processes? In my application the search for "Paris" gives 34 results, listed to select from. If I select the Paris in Lamar County, Texas, then with one click the place records for the city of Paris, the county Lamar, and the state Texas are created (if not yet in the database), and linked as Paris => Lamar => Texas.

The application has 440 default locations linked to the Lamar County. The user can add any other location he finds in the sources within Lamar County, and link it, too. So it is straightforward to look for all events of individuals / families which took place in Lamar County or any of the locations linked to Lamar County. This part uses the linkage of locations to Lamar County (including linkages with more steps, so buildings, cemeteries etc. are included). An individual buried at Baxterville Cemetery will be found by this search as well as any individual born, baptised, etc. in Paris.

So for me: yes, creating the first SPLAC record in any place in Lamar County will create the Lamar County record, too. However, the next SPLAC record of another place in Lamar County can already use the Lamar County record and link to it. This includes all places at a level under cities, too. So I can create a SPLAC record for any farm and link it to a village in Lamar County or directly to the county. Once I have done this and transfer the data to the next database (maybe the next application), I do not want to create these links again and again.
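The linkage-based search described here can be sketched in Python as a walk up the parent links. The class names, the farm, and the sample events are hypothetical and are not taken from the application being described.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Place:
    name: str
    parent: Optional["Place"] = None  # link to the next higher place record


@dataclass
class Event:
    person: str
    kind: str
    place: Place


def is_within(place: Place, region: Place) -> bool:
    """True if region is the place itself or any record reachable via parent links."""
    current: Optional[Place] = place
    while current is not None:
        if current is region:
            return True
        current = current.parent
    return False


def events_within(events: List[Event], region: Place) -> List[Event]:
    """All events whose place is linked, in one or more steps, to the region."""
    return [event for event in events if is_within(event.place, region)]


# Hypothetical sample data: Paris => Lamar => Texas, plus a farm linked to Paris.
texas = Place("Texas")
lamar_tx = Place("Lamar County", texas)
paris = Place("Paris", lamar_tx)
farm = Place("Example Farm", paris)
lamar_ms = Place("Lamar County", Place("Mississippi"))

EVENTS = [
    Event("A", "BIRT", paris),
    Event("B", "BURI", farm),
    Event("C", "BIRT", lamar_ms),
]
```

Because the comparison is made on the linked record rather than on the name, `events_within(EVENTS, lamar_tx)` returns the events for A and B but not the one in the other Lamar County.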

Norwegian-Sardines commented 1 month ago

How many of those links/events are in

The application has 440 default locations linked to the Lamar County.

Could some of them be in Lamar County, Mississippi?

Norwegian-Sardines commented 1 month ago

So for me: yes, creating the first SPLAC record in any place in Lamar County will create the Lamar County record, too. However, the next SPLAC record of another place in Lamar County already can use the Lamar County record, and link to it.

My point here is that Lamar County (and potentially any place entity) is not unique within the GOV database; it exists within the context of its higher place entities, in this case the State of Texas, the State of Mississippi, or the State of Georgia. So using the record instance named "Lamar" or "Lamar County" as the entry point to find all events that occurred in that county means you must also identify the State within the United States. And of course, as stated previously, Lamar was also in the Country of "The Republic of Texas". This is why the hierarchy of different record instances has issues, and they are compounded if you want to include historical place names, place names by language, and place names that are not unique in the world due to their contextual connections to other place entities.
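To make the ambiguity concrete, here is a small Python sketch in which a lookup by name alone returns several records and only the chain of higher place entities tells them apart. The class, the function names, and the sample records are hypothetical; the three states are the ones named above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Place:
    name: str
    parent: Optional["Place"] = None


def ancestor_names(place: Place) -> List[str]:
    """Names of all higher place entities, nearest first."""
    names = []
    current = place.parent
    while current is not None:
        names.append(current.name)
        current = current.parent
    return names


def full_path(place: Place) -> str:
    """The place name followed by its higher entities, comma-separated."""
    return ", ".join([place.name] + ancestor_names(place))


def find_by_name(places: List[Place], name: str,
                 within: Optional[str] = None) -> List[Place]:
    """Look up records by name, optionally restricted to a named higher entity."""
    matches = [place for place in places if place.name == name]
    if within is not None:
        matches = [place for place in matches if within in ancestor_names(place)]
    return matches


usa = Place("USA")
PLACES = [
    Place("Lamar County", Place("Texas", usa)),
    Place("Lamar County", Place("Mississippi", usa)),
    Place("Lamar County", Place("Georgia", usa)),
]
```

Searching for "Lamar County" alone yields three records; adding `within="Texas"` reduces that to one. The sketch does not cover the historical case, where the same county would also need a date-dependent link to the Republic of Texas.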

albertemmerich commented 1 month ago

Lamar County in Texas has 133 locations in the default database. You are correct, we have to pick this SPLAC record for the County in Texas when searching. My application does it automatically, if you choose it as parent SPLAC record. If you only look for the name "Lamar County", you get the places in other states of the US, too. But that is shown in the result, as the search result will list the name of location, its type, its county, its state, and the country as well as the GPS coordinates.

By the way, do you know an external database which shows all the historic links in US? GOV could do that, if somebody enters the data. GOV is based on the community of researchers cooperating to enter all these data...

Norwegian-Sardines commented 1 month ago

By the way, do you know an external database which shows all the historic links in US? GOV could do that, if somebody enters the data.

No I don’t know of an external DB of historical links for the USA. Sorry, I have enough to do, can’t use up my time entering data for other people!