FamilySearch / GEDCOM

Apache License 2.0

Extending SPLAC to include ADDR and PLAC? #536

Open mother10 opened 1 month ago

mother10 commented 1 month ago

Since I started reading GEDCOM, I have always wondered why in some places in GEDCOM ADDR has to be used, and in other places PLAC. Inside ADDR we have tags for CITY, STAE, POST and CTRY. But inside PLAC we have jurisdictions that do the same?! In places in GEDCOM where only PLAC is allowed now, people start using an address as the leftmost, smallest jurisdiction, because they want to denote an address rather than a place. Or they use the name of a church as the leftmost jurisdiction.

Now I have seen the proposals for SPLAC (see #520 and #527), and I want to add to #520, "Adding SPLAC beside PLAC", because that is more in line with what I will write here. The comparison in that proposal was made with NOTE and SNOTE. But I think it should be with REPO and SOUR. Why?

1 (one) Repository can contain many sources. At an event, we can link to a Source, which can link to a Repository.

I wonder if addresses are not the same.

1 (one) City, can have many addresses. So at an event we should have just an address described, and that address should point to a City (SPLAC). (which in turn points to, a state, which ... etc just as in the new spec of GEDCOM) The address at the event should NOT have CITY, STAE, POST and CTRY. Because that way we have kind of the same information on more places in the GEDCOM.

And to maybe make things clearer, we should not call it ADDR, but maybe BUILDING (or an abbreviation of that like BLDNG) BUILDING can be someones home, or a church or a castle, or a university, a Harbour etc.

Because in 1 home, more children can be born, and in 1 church many children can be baptised, or people can have their wedding, BUILDING should also be a record like structure, not a property of something.

In fact, BUILDING is the smallest form of SPLAC. It should have a name or a title so it can be identified for a user.

So wherever it is now allowed to have either ADDR or PLAC, we will now have BUILDING there. It just describes an address in user-understandable text, and points to the CITY SPLAC entity. I looked at addresses around the world, and that seems way too complicated to "catch" in a specification. That's why I say that inside the BUILDING the address is written as completely as the user wants, but that text will only be output by a program (shown to the user), NOT interpreted; it is not used to define where it is on a map, because of the complications. Defining a BUILDING on a map is done by the link to the "first" SPLAC in the chain.
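The BUILDING-to-SPLAC chain described above can be sketched in a few lines of Python. This is only an illustration under assumptions: BUILDING is a proposed tag that exists in no GEDCOM release, and the record layout, the `parent` key and the `@B1@`/`@P1@` identifiers are invented for the example.

```python
# Hypothetical record store: xref -> record. BUILDING and the "parent" link are
# assumptions taken from the proposal above, not part of any GEDCOM release.
RECORDS = {
    "@B1@": {"tag": "BUILDING", "text": "ChurchRoad 12", "parent": "@P1@"},
    "@P1@": {"tag": "SPLAC", "text": "My Town", "type": "CITY", "parent": "@P2@"},
    "@P2@": {"tag": "SPLAC", "text": "My State", "type": "STATE", "parent": "@P3@"},
    "@P3@": {"tag": "SPLAC", "text": "My Country", "type": "COUNTRY", "parent": None},
}


def chain(xref, records):
    """Walk from a record up through its parents, smallest unit first."""
    seen = set()
    result = []
    while xref is not None:
        if xref in seen:
            raise ValueError(f"cycle in parent links at {xref}")
        seen.add(xref)
        record = records[xref]
        result.append(record)
        xref = record["parent"]
    return result


def as_plac(xref, records):
    """Rebuild a comma-separated PLAC-style string from the linked records."""
    return ", ".join(record["text"] for record in chain(xref, records))


print(as_plac("@B1@", RECORDS))
```

The address text of the BUILDING is only joined for display here; nothing in the sketch tries to interpret it, which matches the intent stated above.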

Now because there can be more BUILDINGs in a City, inside BUILDING we can also use MAP with its 2 coördinates to pinpoint its position in a City. If that position is not present, it will be put at the center of the city it belongs to (the default position of the City itself). On top of others that have no further positioning.

By adding MAP to BUILDING, it will now also be possible to denote the birthplace when a child is born on a ship or in a plane, or in a car somewhere, because a hospital could not be reached in time. Same is valid for a Death at sea. So instead of dying in the "Atlantic Ocean", which is so huge we have no idea where that might have been happening, we maybe are able to figure out from the route a ship took, and the date of the death, where it might have been and show a bit better in that immense ocean where it happened. So BUILDING could also be a Plane or a Boat.

To me I think BUILDING should also have a NOTE structure and a SOURCE citation.

And SPLAC would have a CHAN and a CREA.

Areas: I remember having seen, I think, Luther talking about people wanting an area to denote an event. That is also possible: we define RADIUS under MAP and give meters, kilometers or anything, and we have a circle with a center, denoting the area where things took place.

1 MAP
2 LATI N18.150944
2 LONG E168.150944
2 RADIUS 5.4KM

Instead of RADIUS we could also have SQUARE or RECT, like:

1 MAP
2 LATI N18.150944
2 LONG E168.150944
2 RECT X:+5.7KM Y:-5.7KM

Depending on the + or -, the coordinates denote which corner it is. (Both +: the corner is the bottom left; both -: that corner is the top right.)

But to me RADIUS seems easier.
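A rough sketch of how a program might use such a circle: parse the LATI/LONG payloads, parse the radius, and test whether a point falls inside. RADIUS is only a proposal in this thread, so its payload format ("5.4KM" or a value in "M") and all function names below are assumptions; the distance uses the standard haversine formula on a spherical Earth.

```python
import math
import re


def parse_coord(value):
    """Convert a LATI/LONG payload such as 'N18.150944' to signed degrees."""
    hemisphere, degrees = value[0].upper(), float(value[1:])
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError(f"unknown hemisphere in {value!r}")
    return -degrees if hemisphere in ("S", "W") else degrees


def parse_radius_m(value):
    """Parse the hypothetical RADIUS payload ('5.4KM' or '300M') into metres."""
    match = re.fullmatch(r"\s*([0-9]*\.?[0-9]+)\s*(KM|M)\s*", value.upper())
    if not match:
        raise ValueError(f"cannot parse radius {value!r}")
    number, unit = match.groups()
    return float(number) * (1000.0 if unit == "KM" else 1.0)


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres (haversine, sphere of radius 6371 km)."""
    earth_radius = 6371000.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * earth_radius * math.asin(math.sqrt(a))


def within_radius(lati, long_, radius, lat, lon):
    """True if (lat, lon) lies inside the circle given by the MAP payloads."""
    center_lat, center_lon = parse_coord(lati), parse_coord(long_)
    return distance_m(center_lat, center_lon, lat, lon) <= parse_radius_m(radius)


print(within_radius("N18.150944", "E168.150944", "5.4KM", 18.16, 168.150944))
```

A RECT variant would need a decision on how the signed offsets translate into corners, which is one reason the circle is simpler to implement.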

My guess is this will also reduce GEDCOM size, as a lot of "doubled text" (from all the ADDR's that are in fact the same) will be removed and move into the corresponding SPLAC records.

If BUILDING has a MAP structure and SPLAC too, the MAP of BUILDING should be used to show on a map, as that's more precise. The MAP of a City points to an arbitrary point inside the City: a default point, in case the BUILDINGs pointing to that City have no MAP structure.
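The fallback rule can be sketched as follows. The class and field names are hypothetical, the coordinates in the usage lines are arbitrary, and continuing up the chain past the city when the city itself has no MAP is an extra assumption that the text above does not state.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in signed degrees


@dataclass
class SharedPlace:
    name: str
    map_point: Optional[Coord] = None
    parent: Optional["SharedPlace"] = None


@dataclass
class Building:
    title: str
    place: SharedPlace
    map_point: Optional[Coord] = None


def display_position(building):
    """Prefer the BUILDING's own MAP; otherwise use the nearest SPLAC that has one."""
    if building.map_point is not None:
        return building.map_point
    node = building.place
    while node is not None:
        if node.map_point is not None:
            return node.map_point
        node = node.parent  # assumption: keep looking one level up
    return None


town = SharedPlace("My Town", map_point=(52.0, 4.0))
print(display_position(Building("Church", town, map_point=(52.01, 4.02))))
print(display_position(Building("Home", town)))
```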

In the "SPLAC beside PLAC" md file, I think there is an SPLAC record missing under RECORD :=. For the TYPE I would choose to have everything in uppercase, so CITY, COUNTY, STATE, COUNTRY. I think it might be clearer if there also was an example with a real place name.

Maybe, in case of the above example for an Ocean, have a TYPE OCEAN too? And maybe a Type AIR? These last 2 Types have no other SPLACs that they point to, or that point to those I presume.

Something else about SPLAC: now it has coördinates. But what if there were more ways of denoting a place than just LAT and LONG and an address? Then it might need a TYPE to tell what mapping system is used. And maybe more than 1 mapping system can be used for 1 SPLAC? There seem to be other systems like What3words, UTM coding, Plus Codes, and maybe more. But those do not seem very common yet?

I am sure I forgot things, but I wanted to add it here, to maybe inspire someone.

Norwegian-Sardines commented 1 month ago

In 40 plus years of using GEDCOM I've never used the Address_Structure! As far as I'm concerned it is useless! But to each their own!

If I need to include an address for a place it is just another entry in the PLAC text. Just like if I need to add a grave marker (because I know the exact location of that marker) I put either the word "Grave" or if I know whatever the cemetery uses to locate the grave (i.e. row, section, columbarium number, "At Sea", etc.)

As far as items that are bigger than a point (Y/X coordinates) we should also have a polygon option in a "PLAC" node.

I also want a TEXT tag that can use a markup/markdown language option to create a formatted description of the place being saved!

If I was part of the SPLAC committee these are things I would add, but I'm not!!

dthaler commented 3 weeks ago

Discussion in GEDCOM Steering Committee 20 AUG 2024:

tychonievich commented 3 weeks ago

A somewhat related topic: in my opinion, ADDR is more like EXID than it is like PLAC: it's an identifier, usually defined or accepted by some kind of postal system, for a receptacle or recipient of letters and packages. I think that this is consistent with the specification for ADDRESS_STRUCTURE's wording "as it would appear on a mailing label", but could be more clear (or could be an incorrect reading). Like @Norwegian-Sardines I rarely use it and would be happy to use detailed PLAC instead as @mother10 proposes, but I also think that there are cases where a postal address identifier is useful (identifying SUBM, REPO, and postal addresses that appear in sources, such as the address of a lawyer associated with a legal document).

mother10 commented 3 weeks ago

In my tree, from around 1850 or so, a kind of "housecards" were used. Those cards were for 1 specific house in some suburb or street (depending on the way they did addressing). On those cards is a lot of important info: names, relations of people living there, dates of birth, when they arrived and left, and where they came from or went to. So yes, I have addresses in my tree, but only from that time and later. So there should be some way of expressing that. But like Luther said, that's more an EXID thing: street (suburb) and house number. All the rest can go in SPLACs.

Further, as I already mentioned, in the forums of the program I use, people asked to have the leftmost of the PLAC jurisdictions to be an address. Probably for the same reason.

So a postal address is not just for today, for lawyers and such.

Norwegian-Sardines commented 3 weeks ago

Tineke,

If I’m reading you correctly, these “housecards” sound like Source_Records and therefore assert names, relations, birth dates, residence and other “facts”.

Just like a census that also asserts facts, I would create a Source_Record for the census (i.e. 1920 US Census) with a date of the enumeration and associate it with all of the facts it asserts including the Residence, I would do the same with your “housecard”. In both cases I add the house and street number to the PLAC tag.

In my application we have extended the application to have additional information about each item in the comma delimited PLAC payload that provides detail about the item/location including images and text/history. Meaning I can view information about the division, address, country etc. of any place in my database!

mother10 commented 3 weeks ago

I understand what you say. But in the Dutch application my data comes from, those are all addresses. They are added as "facts" to a person, showing where that person lived, with a from date and an end date. So many Dutch people will have it like this, as that program is the most widely used program in our country.

But yes, also a SOUR.

albertemmerich commented 3 weeks ago

@Norwegian-Sardines: Let me show why I am using both PLAC and ADDR, even if I have the building in the PLAC hierarchy:

2 PLAC myhouse, myvillage, mycounty, mystate, mycountry
3 FORM building, city, county, state, country
2 ADDR mystreet myhousenumber
3 CONT zip mymunicipality

The PLAC hierarchy is NOT the same as the ADDR hierarchy. I cannot put the ADDR as part of PLAC, as PLAC has no zip code, and as PLAC has the jurisdiction "city", which the address does not have, and municipality is used by the postal address, but is not part of PLAC. This is not only theory: in my own postal address you will not find the city, but the municipality. And in the PLAC I do not show the municipality, but the city.
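To make the difference concrete, here is a small sketch that pairs each PLAC part with its FORM label, while the ADDR payload stays free text. The function name is made up for the example; the payloads are the ones from the snippet above.

```python
def place_levels(plac, form):
    """Pair each comma-separated PLAC part with the label at the same position in FORM."""
    names = [part.strip() for part in plac.split(",")]
    labels = [part.strip() for part in form.split(",")]
    if len(names) != len(labels):
        raise ValueError("PLAC and FORM must have the same number of parts")
    return dict(zip(labels, names))


levels = place_levels(
    "myhouse, myvillage, mycounty, mystate, mycountry",
    "building, city, county, state, country",
)
# The ADDR payload is kept as written: ADDR line plus its CONT line.
address = "mystreet myhousenumber\nzip mymunicipality"

print(levels["city"])            # a PLAC level
print("municipality" in levels)  # the municipality only occurs in the address
```

The zip code and municipality have no slot in the PLAC/FORM pairing, and the city has no slot in the address, which is the mismatch described above.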

myhouse can be a farm name, or something like that. It refers to the same location as the address does; however, the address is like an EXID (see Luther's comment), which, to make it more complicated, may vary from time to time. Some hundreds of years ago little villages had no street names, only house numbers. Later the street names were introduced, together with new house numbers. Then the postal code was introduced; however, that system has been modified twice since then.

So if we add ADDR to the SPLAC records, we will have a separate substructure for the address in the SPLAC record. And that substructure must have a DATE substructure to show the time dependency.

mother10 commented 3 weeks ago

Oh, and the extended information you mention will now go in the different SPLACs, that is, if they choose SPLAC not inside PLAC.

Norwegian-Sardines commented 3 weeks ago

but I also think that there are cases where a postal address identifier is useful (identifying SUBM, REPO, and postal addresses that appear in sources such the address of a lawyer associated with a legal document)

In v5.5.1 GEDCOM, without enough fields to capture enough information about an “artifact source” what I’ve done for documents that were created (like a legal document), the author (AUTH) is the lawyer/writer/scribe of the document and any “location based” information would follow the suggestion for unpublished work suggesting that we use the Source_Publication_Facts payload and input the city, state of the writer of the document!

Maybe Source_Record needs a template for other data, just like I proposed for creating a Citation_Record!

Norwegian-Sardines commented 3 weeks ago

Albert,

The PLAC hierarchy is NOT the same as the ADDR hierarchy. I cannot put the ADDR as part of PLAC, as PLAC has no zip code, and as PLAC has the jurisdiction "city", which the address does not have, and municipality is used by the postal address, but not part of PLAC. This is not only theory, in my own postal address you will not find the city, but the municipality. And in the PLAC I do not show the municipality, but the city..

I don’t understand the difference between a city and a municipality. In Norway, people who live on a farm live in a Kommune not a city, I put the Kommune name rather than a city in the PLAC hierarchy. In the US we have rural townships, I use these when locating a farm, because city does not make sense!

I agree that you need to have an address and zip code, but what are addresses and zipcodes for? Mailing letters! Personally I think GEDCOM should not be used to create mailing labels, but to each their own!

Norwegian-Sardines commented 3 weeks ago

On the other hand, if you did want to use GEDCOM to maintain an address book, I think putting the Address_Structure inside the Place_Structure is incorrect! I’d rather see it used as a stand alone Fact with a date range rather than as a subtag of all facts, having the zip code and a mailing label address layout for facts like OCCU and CENS makes little sense, most likely you are not going to send a letter (using zip code) to these places!

albertemmerich commented 3 weeks ago

In my region (Germany) we have:

  1. building
  2. city (the name of village/city where you live)
  3. municipality (the lowest administration level)
  4. county (next administration level)
  5. district (administration level, only in some states)
  6. state
  7. country

In some parts of Germany we have another administration level in between municipality and county.

By the given hierarchy a city is administrated by a municipality.

The postal address in most cases is built using the municipality, not the city (sometimes the names of these levels are the same, then you do not see the difference). And the postal address is built by using street name and house number, not the farm name. The farm name is part of PLAC; a different street name with house number (pointing to this farm) is not part of PLAC.

What are the addresses for? To store this information as it is given by several sources, like civil birth and death records, books with all addresses of people living in a city in a defined year (like a census), letters written from a sender and his address to a receiver and his address, and contracts of two parties identified by name and address. Often the address is very important information to identify the individual: someone with the same name may be living in the same village at the same time. Zip codes are very important to identify municipalities (we have a lot of them with the same name, however in different counties or states).

If you do not document the addresses, you do not need the ADDR. Your decision. If you do document, you need it. And GEDCOM is to transfer all data we collect in our genealogical research.

Norwegian-Sardines commented 3 weeks ago

I can understand your requirement to record the mailing address in these cases although I would only add it as a NOTE for the fact rather than as a Formatted Mailing Label, because the formatting has less value!

Currently, the Place_Structure and the Address_Structure are not related to each other. Event_Detail

This means that we can record an address {full formatted address as it would appear on a mailing label, including appropriate line breaks (encoded using CONT)} related to the fact that is structurally different from (and independent of) the Place_Structure. This would work for your intended need: recording a mailing address that is not the same as the location (PLAC) of the "fact".

Adding the mailing address for all levels of the SPLAC binds that mailing address to that SPLAC instance, and to all uses of that SPLAC instance, and would require a date-driven Address_Structure as well if the "location" stayed the same but some element of the Address_Structure changed.

I would rather see the Place_Structure, Address_Structure and the Shared_Place_Structure all be separate data elements not dependent on each other

mother10 commented 3 weeks ago

Hi all, I have been reading your comments and thought about them. I would like to step back and see what we are doing here. (For the "we", read in fact "you all".) We started with a PLAC structure, and that was cut into separate individual pieces; those pieces could be linked together in a defined way to form the original PLAC. Now the cutting in fact gave jurisdictions, so why were they called SPLAC? Because they came from PLAC? They are new, not yet existing structures for GEDCOM, so as they are in fact jurisdictions, why not call those JURIS structures then? To people using GEDCOM that might sound way more logical, because that's in fact what they are. Naming them SPLAC, which sounds like "place", gives confusion. They are not places in themselves; only when together they form a complete string do they form a PLAC.

Choosing the right name for something can help tremendously in understanding new things. As I said in the starting post, they work like REPO and SOUR, only this time they are not just two steps, they form a whole "staircase". The bottom step is the position where an event happened, the topmost step is the country where it happened. To have a correct "staircase" we need a topmost step, 1 or more steps in between, in sequence of area size (as jurisdictions are), and a bottom step. The bottom step should be the most accurate position we can have for an event. Often that can be a home, but it can also be a farm or a commune or other things as I mentioned before, and as others stated here. And maybe we only have a place name.

Should the topmost step be "WORLD" because then we can link events on the Atlantic and such, directly to JURIS "WORLD" and we would also have a correct PLAC according to the definition.

Now the "mailing" addresses. They give confusion; that's why I started by saying we should leave out CITY, STAE, POST and CTRY. They should be, like Luther said, EXID kind of things. Their contents should be the responsibility of the user. Why? Because there are so many ways addresses can be written, and it's almost impossible for a program to construct a correct one. So in case an address is needed for a contract of a lawyer and such, that should be entered by the user in an address that is more like a NOTE with several lines of text, controlled by the user (as Norwegian-Sardines said). That address could be used to send a letter or parcel to, but I don't think it should be the responsibility of GEDCOM to check if it's a correct address. GEDCOM should treat that as a NOTE.

The other type of address, as I mentioned on the "housecards", they have a street and a housenumber, or, as Albert said, in older times they were just housenumbers in a small village. Maybe we should better call those "Positions". These addresses/positions form the lowest "step" of the JURIS staircase.

They are not used to send a letter, only to position an event on a map. (Hence I said they all might have a MAP structure)

It can very well be that we cannot really locate those "address"-JURIS's (anymore) because we have no idea where that street and house number might have been. They could have been destroyed for some reason (flood, bombing, anything) or might have disappeared because new houses were built there. But no matter what, those smallest JURIS's (Positions) are the bottom step in our PLACE staircase.

Maybe we dont know the smallest JURIS, and we only know the City, then that is the lowest step of the staircase.

As @Norwegian-Sardines and others said, the way of addressing a "position" varies much across countries and even inside one country. The sequence of the steps is not the same everywhere. So the JURIS's should have a TYPE, the same as the jurisdiction names in the PLAC statement have now. Extra TYPEs are needed because of the variations in the countries for jurisdictions (like Municipality, CODE-Insee and more). In 1 "staircase" each TYPE is only allowed once. The sequence of the TYPEs is the user's responsibility.
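The "each TYPE only once per staircase" rule is easy to check mechanically. A minimal sketch, assuming a staircase is given as a bottom-to-top list of (TYPE, name) pairs; JURIS and its TYPE values are proposals from this thread, and the function name is invented here. Comparing TYPEs case-insensitively is also an assumption.

```python
from collections import Counter


def duplicate_types(staircase):
    """Return the TYPEs that occur more than once in one staircase."""
    counts = Counter(step_type.upper() for step_type, _name in staircase)
    return sorted(step_type for step_type, n in counts.items() if n > 1)


staircase = [
    ("BUILDING", "myhouse"),
    ("CITY", "myvillage"),
    ("COUNTY", "mycounty"),
    ("STATE", "mystate"),
    ("COUNTRY", "mycountry"),
]
print(duplicate_types(staircase))                              # no duplicates
print(duplicate_types(staircase + [("city", "othervillage")]))  # CITY used twice
```

The order of the steps is deliberately not checked, since the text above leaves the sequence to the user.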

JURIS should have the possibility (as Norwegian-Sardines mentioned) to store extra information to describe a city or a state or more. Adding pictures should also be possible. If we define something new, let's think carefully about what should go in there, so we don't have to add forgotten things too soon after the release.

I would like to add 1 extra TAG here. As the PLAC is now, with jurisdictions that can be empty, that sometimes gives users place names like: , Den Helder, , So with "empty" commas. Now, just as the proposal for the new NAME structure has a TYPE "RUFNAME" to denote which name should be underlined, I would like to propose a TAG CITYNAME (or alike) to denote which of the JURIS names should be presented to the user. This tag is only allowed in 1 (one) of the steps in the JURIS staircase.
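Both ideas fit in a short sketch: dropping the empty jurisdictions for display, and choosing the one step that carries the proposed CITYNAME marker. CITYNAME is only a suggestion in this thread; the dictionary layout and function names are assumptions for the example.

```python
def compact_plac(payload):
    """Drop empty jurisdictions from a PLAC payload, for display only."""
    return ", ".join(part.strip() for part in payload.split(",") if part.strip())


def display_name(steps):
    """Pick the step flagged with the hypothetical CITYNAME marker.

    steps is a bottom-to-top list of dicts with a 'name' key and an optional
    'cityname' flag. Without a flag, all non-empty names are joined.
    """
    flagged = [step["name"] for step in steps if step.get("cityname")]
    if len(flagged) > 1:
        raise ValueError("CITYNAME is only allowed on one step of a staircase")
    if flagged:
        return flagged[0]
    return ", ".join(step["name"] for step in steps if step["name"])


print(compact_plac(", Den Helder, ,"))
print(display_name([
    {"name": ""},
    {"name": "Den Helder", "cityname": True},
    {"name": "My Province"},
]))
```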

Norwegian-Sardines commented 3 weeks ago

I'm still going through all of what you wrote, so I'll answer smaller bits as I find them!

Now the cutting gave in fact Jurisdictions, so why were they called SPLAC? Because they came from PLAC? They are new, not yet existing structures for GEDCOM, so as they are in fact jurisdictions, why not call those JURIS structures then? To people using GEDCOM that might sound way more logical because thats in fact what they are. By naming them SPLAC, that sounds like "place" it gives confusion. They are no places in itself, only when together they form a complete string, they form a PLAC.

For me (and maybe this is just my interpretation) a "Place" does not stop with "jurisdictions"! If I have a BURI fact I also want to have a place to record the location of the exact grave marker. In the software I use we can map to a specific spot on the map, including the exact location of the grave. The same is true for RESI (residence): I want to include the street address or farm name in the PLAC tag and map that location. This is why I've used the SPLAC tag, it is a "Shared_Place".

In addition, a place can have multiple parent places. Let's take an example: the city of Gdansk, in what is now Poland, was for a time part of the Prussian Empire and before that was officially in the "Kingdom of Poland". Individuals born in various parts of Poland during times of annexation and control could have birth certificates stating Prussia. The hierarchy for Gdansk should include all possible parent jurisdictions with From/To dates. My own grandfather was born in "The Hungarian Empire"; the town has changed names, but it is now in Serbia! We need alternate names with date ranges (from/to)!
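Dated parent links could look roughly like this. The class, the field names and especially the years are invented placeholders for illustration, not historical data and not part of any SPLAC draft.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ParentLink:
    parent: str               # name or xref of the superior jurisdiction
    from_year: Optional[int]  # None means "open-ended"
    to_year: Optional[int]


def parents_in(links: List[ParentLink], year: int) -> List[str]:
    """Return every parent whose From/To range covers the given year."""
    return [
        link.parent
        for link in links
        if (link.from_year is None or link.from_year <= year)
        and (link.to_year is None or year <= link.to_year)
    ]


# Placeholder names and years only; real data would come from sources.
links = [
    ParentLink("Parent A", None, 1800),
    ParentLink("Parent B", 1801, 1900),
    ParentLink("Parent C", 1901, None),
]
print(parents_in(links, 1850))
```

Returning a list rather than a single parent leaves room for overlapping or disputed periods, where more than one superior jurisdiction applies at once.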

One other reason I want to have separate SPLAC nodes is so I can associate formatted text with the node to describe the place for my readers who are not "history people" but still may be interested in the history of the place, be that place a farm, city, country.

We could name SPLAC something else but my vision is to deprecate PLAC at some point.

albertemmerich commented 3 weeks ago

Hi Tineke, JURIS would not meet the real sense of places, as they include buildings, cemeteries, churches, villages. None of these objects are jurisdictions; only the administrative levels following in the hierarchical structure of PLAC are jurisdictions. The German GEDCOM-L group did not use SPLAC, because the old PLAC is not what the new object / record will be: PLAC is the name of a place, followed by comma-separated hierarchical administrative jurisdictions. "SPLAC" is quite another object: it is the object on a certain level of all these buildings, villages, and jurisdictions. It may have child objects (lower level objects) or parent objects (higher level objects). These objects may be places, administrations, religious objects (like churches and parishes), geographical objects (like a region or a continent). The German GEDCOM-L group called all these objects "locations" and created the tag _LOC for it. These location records are already exported by some of the programs within the GEDCOM-L group, and webtrees has an extension to include this solution, too. I am with you that SPLAC is not a good name for this kind of record, as they are NOT shared PLACes, but objects of quite another type. They include a lot of data of the substructure of PLAC, but the PLAC itself is modified to a hierarchical system of records and no longer has comma-separated hierarchical "jurisdictions"!

The _LOC record has a broad set of subtags. You find it in http://genealogy.net/GEDCOM/GEDCOM551%20GEDCOM-L%20Addendum-R2.pdf on page 13, defined as extension to 5.5.1 (it works with 7.0, too)

Norwegian-Sardines commented 3 weeks ago

Albert,

I don’t see any difference between the word “place” (a particular position or point in space) and “location” (a particular place or position), per the Oxford Dictionary. So the only reason I use the tag SPLAC is because it is a place on a map that is shared, and it replaces PLAC at some point. And SPLAC is similar to the new SNOTE.

Norwegian-Sardines commented 3 weeks ago

Other software has an extension that extends the PLAC tag to a shared (0 level) position but does not create a hierarchy. So SPLAC can be used by them without much change in their code!

mother10 commented 3 weeks ago

Thanks for all input. Now we wait what the committee decides.

tychonievich commented 2 weeks ago

Discussed in steering committee

There are multiple open issues here that deserve additional thought, design, and discussion. Some of that will happen in the open meeting on SPLAC, announced in #538, though this is a big enough topic that the conversation will probably extend beyond that meeting too.

Norwegian-Sardines commented 2 weeks ago

If the <<ADDRESS_STRUCTURE>> is to be maintained as subtags of PLAC and Shared_Place_Records I think we should revisit the design of the <<ADDRESS_STRUCTURE>>. Currently the <<ADDRESS_STRUCTURE>> has the following definition:

The payload is the full formatted address as it would appear on a mailing label, including appropriate line breaks (encoded using CONT (p.73) tags). The expected order of address components varies by region; the address should be organized as expected by the addressed region.

Which for the most part is only valuable when either viewed by a human or used when creating a mailing label. The individual data points within the structure can not be parsed into meaningful parts and is of little value beyond what we could put into a NOTE except that it is defined as an address.
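As an illustration of how little structure that payload carries, the sketch below rebuilds the label text from an ADDR line and its CONT lines and does nothing else. The function name is made up, the sample lines are hypothetical, and the line handling is simplified (no escape handling, one space between level, tag and value).

```python
def addr_payload(lines):
    """Return the first ADDR payload with its CONT lines joined by newlines."""
    parts = []
    addr_level = None
    for raw in lines:
        fields = raw.strip().split(" ", 2)
        if len(fields) < 2:
            continue  # blank or malformed line
        level, tag = int(fields[0]), fields[1]
        value = fields[2] if len(fields) > 2 else ""
        if addr_level is None:
            if tag == "ADDR":
                addr_level = level
                parts.append(value)
        elif level <= addr_level:
            break  # we have left the ADDR structure
        elif tag == "CONT" and level == addr_level + 1:
            parts.append(value)
        # other substructures (CITY, POST, ...) are ignored here
    return "\n".join(parts)


sample = [
    "2 ADDR mystreet myhousenumber",
    "3 CONT zip mymunicipality",
    "3 CITY mymunicipality",
    "2 NOTE not part of the address",
]
print(addr_payload(sample))
```

The result is a block of text suitable for showing to a person or printing on a label; a program cannot tell which part is the street and which is the municipality without the extra subtags.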

The other tags found in the <<ADDRESS_STRUCTURE>> {ADR1, ADR2, ADR3, CITY, STAE, POST, CTRY} (as noted in the GEDCOM Specification) are best used by systems that may "have structured their addresses for indexing and sorting." and are subject to deprecation. This may be a good time (v7.1 GEDCOM) to actually deprecate these subtags. We could also then instead of having a "Structure" just implement the ADDR tag with implied CONT subtags.

mother10 commented 2 weeks ago

@Norwegian-Sardines Yes I agree with that! I think I already said to take out City etc.

Norwegian-Sardines commented 1 week ago

In advance of the upcoming meeting I'd like to express some thoughts for discussion on the topic of the PLAC tag in a future release.

These thoughts are not about a specific solution (structure or design), but instead some of the elements I'd like to see in the solution.

1) I was trained in science to record my observations, therefore the solution must contain a data element dedicated to the observed "Place of the Event".
2) Interested parties, present day readers and modern maps need current location names, therefore the solution must contain a data element dedicated as a finding aid that describes the "Current Place Name of the Event".
3) Because places can be known by several names and/or have different superior jurisdictions in their history, the solution must contain a data element dedicated to "Alternate Names". Optionally Dates and Language information may be included.
4) A place is like an Individual, with a history that goes beyond its name; the solution should contain a data element that can be used to record information about the place. History information may directly relate to the target family or individual. (Language use in the Text should be included in this element.)
5) Images (maps, photos) from/of the place are important to maintain and distribute. The solution should contain a data element that can be used to record and maintain digital images of the place of the event.
6) Map location coordinates of the place. The place may not be found on general purpose maps, such as graves, homes and other buildings. A researcher may have recorded a location in the field, therefore the solution must contain a set of map elements to record the location of the place.
7) A way to reuse any of the above data elements for any event that occurred in the same place, to reduce redundancy and increase accuracy.

This list can be discussed, revised and augmented and can be used as a starting point for any design ideas.

Thanks

albertemmerich commented 1 week ago

I would like to add:

  8. Reference to elements in location databases in internet (As by FamilySearch, geonames, gov.genealogy.net)
  9. link to webpages describing the place
  10. type of place. Optionally Dates and Language information may be included.
  11. Zip code of place. Optionally Dates information may be included.
  12. Pointers to higher level elements (political, religious, geographical hierarchies). Optionally Dates and Type information may be included.
  13. Source citations for all data

Adding to 6.: for bigger elements not only coordinates, but the area

Norwegian-Sardines commented 6 days ago

I think addition number 12 references design of the solution rather than a desire! A solution may not include "higher levels" as indicated in previous discussions!

But this can be discussed in detail tomorrow.

albertemmerich commented 6 days ago

I agree "pointer" is part of a solution. Desire is to document all hierarchical connections in between places on different levels at any time.

mother10 commented 5 days ago

Maybe the <<ADDRESS_STRUCTURE>>, as @Norwegian-Sardines said, should be an ADDR with CONT and CONC indeed. But I wonder if ADDR should not have a TYPE too. One TYPE is just for addressing labels, the other TYPE is for describing the place of an Event. I can have a child born at ChurchRoad 12, but I am not able to give birth to a child inside P.O. box 1234.

And the addressing label type: should that not have the possibility to have a barcode added? When I send/receive parcels I see everyone scanning barcodes, and when you send a parcel, it gets a barcode that is scanned at every step the parcel passes. I don't know if that fits in the coming discussion too.

Norwegian-Sardines commented 5 days ago

Maybe the <<ADDRESS_STRUCTURE>>, as @Norwegian-Sardines said, should be an ADDR with CONT and CONC indeed. But I wonder if ADDR should not have a TYPE too. One TYPE is just for addressing labels, the other TYPE is for describing the place of an Event. I can have a child born at ChurchRoad 12, but I am not able to give birth to a child inside P.O. box 1234.

And the Addressing label type, should not that have the possibility to have a Barcode added? As when I send/recieve parcels I see everyone scanning barcodes, and when you send a parcel, it gets a barcode that is scanned at every step the parcel passes. Dont know if that fits in the coming discussion too.

What I do for Addresses of places is:

2 PLAC ChurchRoad 12, My Town, My State, My Country

I don’t need an <<ADDRESS_STRUCTURE>> for this case.

Dirk-Ahnenblatt commented 4 days ago

What I do for Addresses of places is:

2 PLAC ChurchRoad 12, My Town, My State, My Country

I don’t need an <<ADDRESS_STRUCTURE>> for this case.

Some comments from a genealogy software provider ...

  1. This may work for a residence (although I'm wondering whether it shouldn't be “12, ChurchRoad, ...”, or whether “My Town” should also include the zip code), but what about other addresses like hospitals or cemeteries.

1 BIRT
2 PLAC University Hospital, ChurchRoad 12, My Town, My State, My Country

1 BURI
2 PLAC Central Cemetery, ChurchRoad 12, My Town, My State, My Country

One more hierarchy level. All possible, but makes it not easier, because not needed in every case.

  2. GEDCOM is just a transport medium for genealogical data, but the real usage is in genealogy software. What would be the output for e.g. birth place in your example? The whole PLAC string? Or just the first part, truncated at the first comma?

That's why I am not a big fan of these hierarchies in PLAC tags and prefer the town/city in first position in the PLAC tag (What is your birth place? ChurchRoad 12!).

Conclusion: I would prefer not only <> but also ADDR records, because there is a lot of redundancy in a single GEDCOM file if all residences are filled in, and additional information (e.g. coordinates) would be a benefit.

Dirk (www.ahnenblatt.com)

Norwegian-Sardines commented 4 days ago

My birth place:

2 PLAC City Center Hospital, Birth Town, Birth County, Birth State, USA

One more hierarchy level. All possible, but makes it not easier, because not needed in every case.

What does this mean? If it is not needed don’t put it in!

If I did not know the hospital name, 2 PLAC Birth Town, Birth County, Birth State, USA

What would be the output for e.g. birth place in your example? The whole PLAC string? Or just the first part, truncated at the first comma?

In my software two things are output: 1) A map based off the PLAC data 2) The PLAC data.

Since GEDCOM v5.5.1 does not have a shorter (abbreviated) value the whole string is output on the screen and/or report. Simple!

NOTE: My proposal would be to include an abbreviated place name!

Norwegian-Sardines commented 4 days ago

This may work for a residence (although I'm wondering whether it shouldn't be “12, ChurchRoad, ...”, or whether “My Town” should also include the zip code), but what about other addresses like hospitals or cemeteries.

1) You could put the 12 before the street. In some countries the address is given as street, then house number; in my area we say the house number first. 2) The comma after the number is OK, but I'd prefer it without. In GEDCOM the comma represents a different level in the hierarchy, and in my software this means the street gets a map location as well as the house and the city.

“My Town” should also include the zip code

If it has to be there then include it, but in the USA a town could have dozens of zip codes and it provides no extra value locating the town on the map. In another country the zip code may be assigned to a town and used to differentiate two towns with the same name such as, “MyTown 1234, Country” from “MyTown 5678, Country”.

Dirk-Ahnenblatt commented 4 days ago

My birth place:

2 PLAC City Center Hospital, Birth Town, Birth County, Birth State, USA

One more hierarchy level. All possible, but makes it not easier, because not needed in every case.

What does this mean? If it is not needed don’t put it in!

If I did not know the hospital name, 2 PLAC Birth Town, Birth County, Birth State, USA

The meaning of each part between the commas should be defined in HEAD.PLAC.FORM. In your case ...

1 PLAC
2 FORM Place name, Street, City, County, State, Country

... where "City Center Hospital" is the Place name. When using hierarchical place names you can omit one part (like Place name) but have to use the same number of commas. Otherwise the PLAC value can't be parsed and is just a line of text.
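A minimal Python sketch of the parsing rule described here, assuming both the FORM value and the PLAC value are plain comma-separated strings. The function name `parse_plac` is hypothetical and not part of any GEDCOM library:

```python
def parse_plac(plac_value, form_value):
    """Map each comma-separated part of a PLAC value to its FORM jurisdiction.

    Returns None when the number of parts differs from the FORM, in which
    case the PLAC value can only be treated as a plain line of text.
    """
    jurisdictions = [part.strip() for part in form_value.split(",")]
    parts = [part.strip() for part in plac_value.split(",")]
    if len(parts) != len(jurisdictions):
        return None
    return dict(zip(jurisdictions, parts))


FORM = "Place name, Street, City, County, State, Country"

# Street is omitted, but its comma is kept so the positions still line up.
with_hospital = parse_plac(
    "City Center Hospital, , Birth Town, Birth County, Birth State, USA", FORM
)

# Commas dropped along with the omitted parts: cannot be mapped onto the FORM.
without_commas = parse_plac("Birth Town, Birth County, Birth State, USA", FORM)
```

With these inputs, `with_hospital` maps "Street" to an empty string and "City" to "Birth Town", while `without_commas` is `None`.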

About street numbers and zip code: I thought you wanted to have a hierarchical place structure within the PLAC tag, defined by HEAD.PLAC.FORM. It seems that I was wrong. I thought you use the PLAC value as a replacement for the address structure. But you are not interested in zip code (as in ADDR.POST). Your PLAC value is just a line of text. It could start with a place name (e.g. hospital name), street or city. So statistics about most used places in your software are not possible.

I personally would prefer to have only City as PLAC value and more details in a _LOC/SPLAC record (e.g. State, Country) and/or an SADDR record (e.g. name of hospital or cemetery).

Dirk

Norwegian-Sardines commented 4 days ago

I don’t use PLAC.FORM; it has no value to me or anyone that uses my software!

Place Statistics start with the highest order (country), all levels should be consistent as you go down. I see no problem!

If I need to know the number of people that had an event in a city that city is described once and reused!
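One way such top-down statistics could be computed without PLAC.FORM is sketched below in Python. It assumes every PLAC value lists jurisdictions from smallest to largest and is spelled consistently; the function name `count_by_top_levels` is hypothetical:

```python
from collections import Counter


def count_by_top_levels(plac_values, depth):
    """Count PLAC values by their `depth` highest-order jurisdictions.

    `depth` must be at least 1. The highest order (country) is the last
    comma-separated part, so the key is built from the right-hand end.
    """
    counts = Counter()
    for value in plac_values:
        parts = [part.strip() for part in value.split(",")]
        counts[", ".join(parts[-depth:])] += 1
    return counts


events = [
    "Chicago, Cook County, Illinois State, USA",
    "Chicago, Cook County, Illinois State, USA",
    "New York City, New York State, USA",
    "Rogaland Fylke, Norway",
]

by_country = count_by_top_levels(events, 1)
by_state = count_by_top_levels(events, 2)
```

Here `by_country` counts three events for "USA" and one for "Norway", and `by_state` counts two for "Illinois State, USA". The approach only works as long as the same place is always written the same way.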

Norwegian-Sardines commented 4 days ago

Dirk - I thought you use the PLAC value as a replacement for the address structure. But you are not interested in zip code (as in ADDR.POST)

I personally think the Address_Structure as defined in the GEDCOM v5.5.1 specification has no value in a Genealogical Database.

v5.5.1 - The address structure should be formed as it would appear on a mailing label using the ADDR and the CONT lines to form the address structure. The ADDR and CONT lines are required for any address.

This structure (as described above) is best used to send letters and mail to the individual. First, a large percentage of individuals in most genealogical databases are dead, so they will not be getting mail from me! Second, using the Address_Structure for anything other than a person's residence does not make sense: why send a letter to their church, hospital, or the place they lived in a census 50 years ago? Third, I am not a fan of having address information for the living in a GEDCOM that can be shared with and used by others. Privacy for the living is important and we have no way to prevent a PLAC from being transmitted with an address.

However, I can understand if you want to record an address for an individual's residence. I think it should be recorded in the RESI as a line of text, maybe as part of a SNOTE, which has Privacy coverage and can be prevented from being transmitted.

NOTE: ADDR.POST will probably be deprecated in the future! The v5.5.1 GEDCOM specification says:

v5.5.1 - The additional subordinate address tags such as STAE and CTRY are provided to be used by systems that have structured their addresses for indexing and sorting. For backward compatibility these lines are not to be used in lieu of the required ADDR and CONT line structure.

I am interested in the zip code if it defines a place uniquely where using just the city or region does not.

albertemmerich commented 4 days ago

If you do not use PLAC.FORM, how do you know what place this PLAC: 2 PLAC Kansas, USA describes? A city? A state? Without link to any description uniquely defining the place, the PLAC.FORM is one of the most important features in GEDCOM up to 7.0. In 7.0 we started to have PLAC.EXID which for the first time was a possibility to identify a place without the hierarchical structure in PLAC payload. Now I can do:

2 PLAC Kansas, USA
3 EXID KANSASEM33MU
4 TYPE http://gov.genealogy.net/

So it is the city in Clark County in the state of Arkansas. But now, how do you find out which other cities in Arkansas are in your data?

With the hierarchical structure of _LOC records as defined by GEDCOM-L group no problem: Pick all records which have Arkansas record as next level record, pick again all next sublevel until you find the elements on level "city". If you want to know all lower levels, too: No problem, look for sublevels until there is no element found any more.
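The traversal described here can be sketched in Python as follows. The in-memory layout (id mapped to name, level and parent id) is a hypothetical simplification; the actual _LOC record structure is defined by GEDCOM-L and is not reproduced here. "Example County" and "Example City" are made-up entries, and the sketch assumes the hierarchy is a tree without cycles:

```python
# Hypothetical simplified location records: id -> (name, level, parent id).
LOCATIONS = {
    "@L1@": ("USA", "country", None),
    "@L2@": ("Arkansas", "state", "@L1@"),
    "@L3@": ("Clark County", "county", "@L2@"),
    "@L4@": ("Kansas", "city", "@L3@"),
    "@L5@": ("Example County", "county", "@L2@"),
    "@L6@": ("Example City", "city", "@L5@"),
}


def descendants_at_level(locations, root_id, level):
    """Return the sorted names of all records below root_id with the given level."""
    # Index the records by parent so each sublevel can be picked directly.
    children = {}
    for loc_id, (_name, _level, parent_id) in locations.items():
        children.setdefault(parent_id, []).append(loc_id)

    found = []
    pending = list(children.get(root_id, []))
    while pending:
        loc_id = pending.pop()
        name, loc_level, _parent_id = locations[loc_id]
        if loc_level == level:
            found.append(name)
        # Keep descending until no sublevel is found any more.
        pending.extend(children.get(loc_id, []))
    return sorted(found)


cities_in_arkansas = descendants_at_level(LOCATIONS, "@L2@", "city")
```

With the sample data, `cities_in_arkansas` contains "Example City" and "Kansas".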

Norwegian-Sardines commented 4 days ago

If you do not use PLAC.FORM, how do you know what place this PLAC: 2 PLAC Kansas, USA

I don't really care what type of place this is! It is the place where an event happened. If I take the text "Kansas, USA" and put it into Google Maps it will find the State and it will show on a map. If you transmitted "Chicago, USA" or "Chicago, Illinois" it would find it too!

If I transmitted "Kansas City, USA" it would need more clarification, Missouri or Kansas? So you should be transmitting: PLAC Kansas City, Kansas, USA or PLAC Kansas City, Missouri, USA

This is why it is imperative that a unique name be transmitted that can be found on a map! How do you know what/where this is, even with the PLAC.FORM:

2 PLAC Lincoln, USA
3 FORM City, Country

albertemmerich commented 4 days ago

Zip code is one of the safest ways to identify a place where the same name is used for several places. So it is one of the easiest ways for a safe description of places. I have seen a lot of GEDCOM files in the wild using the zip code within the PLAC payload of 5.5 / 5.5.1, like

2 PLAC Hanstedt, D-27793

or

2 PLAC Hanstedt, D-21271

or

2 PLAC Hanstedt, D-29525

All of them are today in the state of Lower Saxony (Germany): the first in the county of Oldenburg, the second in the county of Harburg, the last in the county of Uelzen. As the administrative organisation is modified more often than the system of zip codes, it is a good way to identify places. And it is used in the wild despite violating the GEDCOM standard, as the zip code is not an administrative level in the place hierarchy. But at import I can identify those places automatically by searching for the zip codes in the GOV system on genealogy.net.

GEDCOM-L has put the zip code as one of the subtags in their location records, using tag POST. I use it very often. I would like to see it in a future place record; otherwise I would have to implement an extension tag _POST in the record.
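A rough Python sketch of such an import-time lookup, assuming German zip codes written as "D-" followed by five digits. The hard-coded table only holds the three Hanstedt examples from this comment and stands in for a real gazetteer query, which is not shown; `county_from_plac` is a hypothetical name:

```python
import re

# County per zip code, taken from the three Hanstedt examples in this comment.
# A real import would look the code up in a gazetteer instead.
COUNTY_BY_ZIP = {
    "27793": "Oldenburg",
    "21271": "Harburg",
    "29525": "Uelzen",
}

# Matches "D-" followed by exactly five digits anywhere in the PLAC payload.
ZIP_PATTERN = re.compile(r"\bD-(\d{5})\b")


def county_from_plac(plac_value):
    """Return the county for the zip code found in a PLAC value, else None."""
    match = ZIP_PATTERN.search(plac_value)
    if match is None:
        return None
    return COUNTY_BY_ZIP.get(match.group(1))


harburg = county_from_plac("Hanstedt, D-21271")
unknown = county_from_plac("Hanstedt")
```

Here `harburg` is "Harburg", while `unknown` is `None` because the bare name "Hanstedt" carries nothing to disambiguate it.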

Norwegian-Sardines commented 4 days ago

Oh, and by the way: for my clients and family in Norway I would actually transmit or display

2 PLAC Kansas State, USA

Just like I would enter: 2 PLAC Rogaland Fylke, Norway

or

2 PLAC Chicago, Cook County, Illinois State, USA

or

2 PLAC New York City, New York State, USA

Norwegian-Sardines commented 4 days ago

If I used a well-managed, maintained and curated offsite database, then these values would have a more disciplined naming structure and therefore also not need PLAC.FORM, because the offsite database would be the expert. Most individuals in the wild (not you) don't care about knowing about or using PLAC.FORM; it is only those building a repository like FamilySearch or GeoNames who care. Most real users and software implementers would use these resources to create a "correct" place hierarchy and not be creating the resource on their own.

Norwegian-Sardines commented 4 days ago

Zip code is one of the safest ways to identify a place where the same name is used for several places. So it is one of the easiest ways for a safe description of places.

I agree to a point. In my town of 47,000 people we have 9 different zip codes; which one do I use?

Norwegian-Sardines commented 4 days ago

Personally, in your case, where each town has only one zip code, you could enter the following:

2 PLAC Hanstedt D-27793, Oldenburg County, Lower Saxony, Germany

or

2 PLAC Hanstedt D-21271, Harburg County, Lower Saxony, Germany

or

2 PLAC Hanstedt D-29525, Uelzen County, Lower Saxony, Germany

But really, in my example the county without the zip code still makes the place unique. So you could also have: 2 PLAC Hanstedt, Oldenburg County, Lower Saxony, Germany

NOTE: I removed the comma between the town and the zip code because they go together to make a unique place, and this way they do not violate the GEDCOM Standard by creating a non-administrative level.

Dirk-Ahnenblatt commented 2 days ago

Now I have read the complete discussion here and would agree to all of @mother10 's proposals.

As I understand PLAC is going down just to the city and gets an SPLAC record for more detailed information. ADDR is in my view not just a postal address, but a more detailed description of the place, building or location. This will get BLDNG records to avoid redundancies and add more details like coordinates. I personally would prefer the term "location" (instead of building). Otherwise we would see the term "Building" in all future software dialogs.

What I also like is the idea of a new tag CITYNAME. I would wish that it is not necessary, because PLAC has already only the value which is presented to the user (like CITYNAME purpose). But there are so many GEDCOMs around which have these comma-formatted PLAC values.

About PLAC, FORM and commas: FORM was introduced with GEDCOM 5.3 to allow a hierarchical structure of the place name in order to specify a place more precisely and distinguish it from places with the same name. But this seems to be only a work-around because of a lack of better options (such as SPLAC records).

In GEDCOM 5.3 there are several samples ... FORM city, county, state, country ... and ... FORM City, County, State, Country

No pre-defined jurisdictions? Should all terms be written in lower case or does the spelling not matter? Do these terms always have to be in English? What about the use of additional jurisdictions? Each software can invent its "own" jurisdiction terms (like "building name" or "street with number"), which in the end only exist in this software. And there is software which doesn't use FORM tags, even if users enter these comma-separated PLAC jurisdictions (at their own risk).

This would be no problem if software handles this comma-thing internally and presents users only the name of the city. But to keep the right hierarchy levels of a PLAC value is in most cases the duty of the user.

I never came across software that checks the number of commas in all PLAC values and gives warnings. What about merging/adding/importing GEDCOM files? Will FORM values and PLAC values/hierarchies (number of commas) be changed? It should be checked, but in reality it is not.
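A comma check of this kind would not need much code. The Python sketch below assumes a single FORM value applies to all PLAC values of a file; `plac_warnings` is a hypothetical name, and no existing program is claimed to work this way:

```python
def plac_warnings(form_value, plac_values):
    """Return one warning per PLAC value whose comma count differs from the FORM."""
    expected = form_value.count(",")
    warnings = []
    for position, value in enumerate(plac_values, start=1):
        found = value.count(",")
        if found != expected:
            warnings.append(
                f"PLAC #{position} {value!r}: {found} commas, FORM expects {expected}"
            )
    return warnings


warnings = plac_warnings(
    "City, County, State, Country",
    [
        "Chicago, Cook County, Illinois, USA",  # matches the FORM
        "Kansas, USA",                          # too few commas
    ],
)
```

With these inputs only the second value is reported. The same check could be rerun after merging or importing files, where FORM values and comma counts are most likely to drift apart.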

Every software has built its own strategy how to deal with these comma-separated values (including "empty" commas) when generating outputs.

In my eyes, with the introduction of SPLAC, the PLAC/FORM comma-separated hierarchy structure should be declared dead. There should be no enforcement in the GEDCOM documentation to still do it. It should be replaced by new SPLAC + BLDNG records (the final name of the BLDNG record can be discussed; I would suggest LOC).

Dirk