burroughsapp / burroughs

Burroughs is an historical database of places in Lawrence, Kansas with a time-machine experience for the user dialing back exterior and interior views of city blocks, buildings, and businesses through the years, decades, and centuries, with building and business histories as well as public discussion walls tied to each physical location.

Linking Establishment Existences to Buildings #8

Open stevedahlberg opened 11 years ago

stevedahlberg commented 11 years ago

It seems like this can be done via Locations.

Once Locations are tied to Buildings through an address field (one that can handle a range of addresses for that "Building", i.e. that instance of a Building), the lookup works as follows. If the Location indexed to the Establishment Existence falls within the address range of a Building entry whose date range intersects that of the Establishment Existence, then the Establishment Existence knows what Building it is/was in. Conversely, the Building should be able to know which Establishment Existences it contains/contained, for every intersection between the two. This way both sides know about each other even if the Establishment Existence persisted across a change in buildings at that Location and, of course, in the more frequent case of the building persisting through many Establishment Existences.
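A minimal plain-Ruby sketch of the lookup described above, not the app's actual code. The `Building` and `Existence` structs, their field names, the street name, and every address and year in the sample data are placeholders assumed for illustration only.

```ruby
# Hypothetical, simplified records; field names are assumptions for illustration.
Building  = Struct.new(:name, :street, :address_range, :years)
Existence = Struct.new(:name, :street, :address, :years)

# Two inclusive ranges intersect when each one starts before the other ends.
def ranges_intersect?(a, b)
  a.begin <= b.end && b.begin <= a.end
end

# Every Building on the same street whose address range contains the
# Existence's address and whose date range intersects the Existence's.
def buildings_for(existence, buildings)
  buildings.select do |b|
    b.street == existence.street &&
      b.address_range.cover?(existence.address) &&
      ranges_intersect?(b.years, existence.years)
  end
end

# The converse lookup: every Existence a Building contains or contained.
def existences_for(building, existences)
  existences.select { |e| buildings_for(e, [building]).any? }
end

# Placeholder data: a building that later absorbs its neighbor's address.
old_building    = Building.new("Old building", "Example St", 746..746, 1900..1930)
merged_building = Building.new("Merged building", "Example St", 742..746, 1930..2013)
bank = Existence.new("Bank", "Example St", 746, 1920..1950)

# The bank persisted across the change in buildings, so it matches both.
puts buildings_for(bank, [old_building, merged_building]).map(&:name).inspect
```

One caveat with this sketch: a plain numeric range such as `742..746` also covers odd numbers on the opposite side of the street, so a real address range would need to account for street side.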

ruralocity commented 11 years ago

You can now id a location to a building; when that happens that location's establishments are associated to the building. See Teller's. Does that accommodate the frequent cases?

stevedahlberg commented 11 years ago

Let me test it some. Can a Building location be a range of postal addresses? That may be key when talking about Buildings in some contexts (not entirely sure yet). Not sure that's how we need to do it - just saying that most buildings have a range of postal addresses associated with them going back more than a century and that these sometimes change when a building assimilates an adjacent building or when part of its footprint gets demolished (think Eldridge Hotel) then later over time the building gets expanded again - but the postal address associated with that part of the footprint was always there, even if a business within the building had a single specific address.

The other situation I'm trying to wrap my brain around: Teller's is at 746 Mass but encompasses the range 742-746 because in 1930, when it was remodeled to become First National Bank from Merchants Bank, they lopped off the third story and assimilated the adjacent building that was at 742 Mass. So the 746 building expanded to include the 742 building (which ceased to exist), and the resulting building now has the range 742-746 even though Teller's, the establishment existence, only has 746.

Something very similar to this happened with both Weavers Dept store (assimilating the adjacent low building which is now the Menswear section) and at Sunflower Surplus (expanded at some point) and Paradise Cafe. The first two plus the Eldridge are on our short list though so if we can address these issues and make these buildings work, then we should be in a good position to attack the potentially even more complicated examples of Paradise Cafe and particularly Bloom Bath / Rudy's / Dusty Bookshelf / Gould Evans. The latter set is probably the most complicated arrangement we'll have to deal with. It started as multiple buildings with Bloom as the first movie theater in Lawrence (there was a vaudeville theater down the street at the Granada), including an associate of Thomas Edison who helped invent some key component of early movies but now with half-levels up and down and businesses spanning parts of adjacent buildings and Gould Evans slowly spreading all the way down the block to the alley (cut-through mid-block) on the upper levels of this row of buildings it's quite confusing :)

stevedahlberg commented 11 years ago

Also, will editing the building name field in Buildings break the linkage?

ruralocity commented 11 years ago

It shouldn't; everything is identified by a serial ID.

On Apr 5, 2013, at 10:03 AM, Steve Dahlberg notifications@github.com wrote:

Also, will editing the building name field in Buildings break the linkage?


stevedahlberg commented 11 years ago

After our discussion yesterday about Buildings being associated with a collection of Locations (as opposed to one), going the route of entering multiple Locations and tagging them to the same Building to accomplish this, and conceptualizing Blocks similarly (where Blocks don't only indirectly contain Buildings but are also associated with a collection of Locations), I realize now that the same is true for Establishment Existences. An Existence will be associated with a specific set of Locations which define its physical presence during that era, whether it be one Location, two contiguous Locations, or multiple non-contiguous Locations.

But I'm a bit unclear about taking the same approach here, making multiple entries of Existences to achieve the same effect as with Buildings (and Blocks when they're set up), as there is nothing to ensure uniformity of an Existence if it's just going to group them by, what, description? And would they tally right? So I'm still confused about how to give each Existence an association with a collection of Locations ...

ruralocity commented 11 years ago

I'm not following--might need to have a conversation about this. Or maybe I just need a diagram or something to help me understand. I'll try drawing out the schema tonight to aid in this discussion.

stevedahlberg commented 11 years ago

I may just not have been clear on how to achieve, through the data entry process, multiple Buildings per Location (to address the case where Building footprints change and Locations belong to one Building at one time and to another at another time), and I was particularly confused about how to do the same with Existences, where an Existence might be associated with multiple Locations.

For the Buildings, when we talked at lunch I just assumed I'd enter multiple identical Locations when necessary to be able to tag them to different Building records. But now that they have Lat and Long and other info, it does seem weird to enter all that Location information multiple times to create multiple identical records in order to achieve the same Location being associated with different Buildings at different points in time.

In short, I see how a Building can have many Locations but I'm not sure how a Location can have many Buildings. I think a similar issue exists for Existences. Or am I missing something obvious?

stevedahlberg commented 11 years ago

[Image: IMAG7861_web]

stevedahlberg commented 11 years ago

Somehow it seems to me that since Locations are static in both place and time that they wouldn't be linking out (via indexed ID field) to anything else like Buildings or Existences (or Blocks, later) because they just are what they are. But, inversely, it does seem like Blocks, Buildings and Existences should link out to Locations but that they need to be able to specify more than just a single instance of what they're linking to (an elastic list rather than a single record) ..... What do you think?

ruralocity commented 11 years ago

I don't think a location should ever belong to multiple buildings, but we do need a way to distinguish different instances of the same physical location over time. That's where I believe adding a set of date ranges to location comes into play. We could also route locations under buildings, but that would require that all locations have a building. (So the URL would be http://blah.com/buildings/building_id/locations/location_id).

For the bottom left, an existence is by current definition an instance of an establishment at a single location. If the establishment moves, it's to a new location. If that's not the case then the data model needs to be redone, I think, as this is a major piece of how everything links together.

Looks like GitHub doesn't support PDF uploads, but you should be able to access a diagram of the current entities as they now exist here: https://github.com/ruralocity/burroughs/blob/master/erd.pdf . To summarize:

stevedahlberg commented 11 years ago

Am still parsing the diagram but something strikes me:

  1. "I don't think a location should ever belong to multiple buildings, but we do need a way to distinguish different instances of the same physical location over time. That's where I believe adding a set of date ranges to location comes into play."

But here's the thing: We're assuming locations don't change over time so it seems counter-intuitive to be creating multiple instances of locations over time, especially when Buildings already ARE different instances over time (as well as different instances in space). It is for this reason that a location really could belong to "different" Buildings - I think the notion that Locations shouldn't belong to different buildings stems from thinking about Buildings as existing in different spaces only, not in different times, but that's how we've defined buildings - to be existing as different instances in both.

stevedahlberg commented 11 years ago

So, the second problem is similar - Existences need a way to link to multiple Locations, just like buildings do, also footprint-related: A given instance of a business quite frequently spans multiple locations, not just one.

stevedahlberg commented 11 years ago

Just brainstorming here, but what would happen if "locations" had no "_id" fields at all but instead Buildings acquired a "location_id" field like Existences has except that in both cases the "location_id" field becomes an array of integers rather than an integer?

ruralocity commented 11 years ago

Here's a stab at explaining my current understanding of the model, with the caveat that there may be outliers requiring special handling:

Now where things get twisted:

So I think this means building BB has locations LD and LE, and maybe a new location LF? If that's the case then yes, the building-to-location relationship is many-to-many, but needs to be qualified with a date. This would be similar to how we handle the relationship between a location and an establishment, via the intermediary existence. A model/table would sit between buildings and locations and contain the two foreign ids and dates (and probably a description and source, just because we can).
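As a rough illustration of that intermediary, here is a plain-Ruby sketch of a date-qualified link between a building and a location. The struct name `DatedBuildingLocation`, the helper names, and all IDs and years are hypothetical; the description and source fields are left out, and treating the end year as exclusive is an assumption made so that a hand-over year belongs to only one building.

```ruby
# Hypothetical join record: one row per (building, location, period).
DatedBuildingLocation = Struct.new(:building_id, :location_id, :start_year, :end_year) do
  # True when the link was in effect in the given year.
  # end_year is exclusive; nil means the link is still in effect.
  def active_in?(year)
    start_year <= year && (end_year.nil? || year < end_year)
  end
end

# Which building (if any) a location belonged to in a given year.
def building_at(links, location_id, year)
  link = links.find { |l| l.location_id == location_id && l.active_in?(year) }
  link && link.building_id
end

# Which locations made up a building in a given year.
def locations_of(links, building_id, year)
  links.select { |l| l.building_id == building_id && l.active_in?(year) }
       .map(&:location_id)
end

# Placeholder data: building 1 covers locations 10 and 11 until 1950,
# then building 2 takes over both and adds location 12.
links = [
  DatedBuildingLocation.new(1, 10, 1900, 1950),
  DatedBuildingLocation.new(1, 11, 1900, 1950),
  DatedBuildingLocation.new(2, 10, 1950, nil),
  DatedBuildingLocation.new(2, 11, 1950, nil),
  DatedBuildingLocation.new(2, 12, 1950, nil)
]

puts building_at(links, 10, 1930).inspect
puts locations_of(links, 2, 1960).inspect
```

In this shape the same location row is never duplicated; only the link rows multiply as footprints change.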

I'd like to get that sorted out before thinking about existences hopping between locations, if possible.

stevedahlberg commented 11 years ago

I'll need to draw out a diagram of what you said, and take a look. But let me just point out that currently anyway Buildings already have dates (just like Existences) re: qualifying with dates. And yes, I think Buildings and Existences are similar with respect to Locations.

stevedahlberg commented 11 years ago

I see this as a 3D model where the XY plane is a map like plane and the Z axis is time. Locations are fixed in the XY plane as infinitely tall columns because they exist in the same place throughout time. However, Buildings and Existences are peppered throughout XYZ space as column pieces of various heights, and instances of both can exist above or below each other in the same (or multiple) column space(s) as the various Location columns.

This to me implies a data structure where the Locations are fairly set - you enter them once up front and they're fixed in space and time from there on out. But, since Buildings and Existences exist in various segments of time and are composed of multiple Locations at different times, it's up to the Buildings and Existences to keep track of (to point to) the static Locations as necessary, rather than the static Locations keeping track of what Buildings they belong to. (Hence the idea of arrays of integers for ID fields in Buildings and Existences - but if there is a different way to implement the equivalent result that's great)

Will draw out your example and see exactly what you mean ...

stevedahlberg commented 11 years ago

Regarding your example:

LA = 723 Mass (25' wide lot with lat/long, not necessarily active postal address at all times)
LB = 725 Mass (25' wide lot with lat/long, not necessarily active postal address at all times)
LC = 727 Mass (25' wide lot with lat/long, not necessarily active postal address at all times)

EA1, EB1, and EC1 are Existences of Establishments - which Establishments exactly don't matter here since they don't occupy time and space like Existences do, they're ancillary information, as is the actual postal address of each Existence (we need a field for PostalAddress in Existences).

----- 1910--------1920------1925----------1950-------1965---------->
BA: ------------------------------------------------>|
-LA EA1 --------------------->|
-LB EB1 ---------->| EC1 ---->|
__BB: ------------->
__-LA E??
__-LB E??
__-LC E??

You don't really say what happens with establishments after LA and LB are joined together but if you had, I would have represented that establishment as occupying both LA and LB during the relevant time segment. Similarly for BB once it begins in 1965.

So, there are only ever three Locations in play here, listed above.

Building BA needs to list an array of Locations [LA,LB] and BB would have [LA,LB,LC].

Regarding Existences, suppose that EA took over the combined space of LA & LB in 1925 (EA1 ends, EA2 begins) when they knocked the party wall out and also expanded into LC once the building was expanded in 1965 (EA2 ends, EA3 begins), then: EA1 has [LA], EA2 has [LA,LB], EA3 has [LA,LB,LC], EB1 has [LB] and EC1 has [LB].

ruralocity commented 11 years ago

Since Rails doesn't have a built-in table viewer and the admin_assistant add-on isn't quite doing the trick, I'm going to get the data out into a format you can play around with in a SQL client of some kind. I've got the Heroku data in Postgres format right now. Do you have a particular database you're comfortable with?

stevedahlberg commented 11 years ago

Postgres is fine ... in the past I used a GUI client called "Yog" or similar. Do you recommend a Windows-based client?

In parallel, hope to solve the whole many to many issues here ASAP so can start entering buildings in different points in time that refer to same locations and Existences that can refer to multiple locations for footprints that cover more than one Location ... Did my take on your example help? I think it's because I'm assuming that each 25 foot wide lot corresponds to a given address on Mass and with lat long can therefore be a static location that can be referenced by multiple buildings at different points in time and by given Existences that span more than one Location at a time ...... I've got a few hours set aside tomorrow just for adding stuff to db! :)

ruralocity commented 11 years ago

I've looked at it but not fully processed it yet. I need to think about it outside the constructs of the current schema.

Not sure of a Windows PG client but I'll email the file. I also converted it to MySQL--might be easier to find something to work with that.

stevedahlberg commented 11 years ago

I think I've got it. Not sure if I should draw it out or describe it or what. For purposes of discussion I'll assume we rename the "address" field of "locations" to "situs_address" (more technically accurate and to distinguish from "postal_address" which needs to be added to "existences" table).

In a nutshell, we have many to many relationships between Buildings and Locations and between Existences and Locations. So, between Buildings and Locations we need a JOIN table "building_locations" made up of buildings.id and either locations.situs_address or just locations.id depending on how descriptive the "building_locations" table should be. Notice then that a query/function/method called "get_footprint" operating on "building_locations" that returns all the rows matching a given id (building.id in this case) effectively generates footprints for Buildings (the sum of all situs addresses with their corresponding lat/longs etc).

Similarly, we do the same thing, with a JOIN table "existence_locations" between Existences and Locations made up of existence.id and either locations.situs_address or just locations.id depending on how descriptive the "existence_locations" table should be (should be consistent with how we do the "building_locations" table). Also similarly, a query/function/method called "get_footprint" operating on "existence_locations" that returns all the rows matching a given id (existence.id in this case) effectively generates footprints for Existences.
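To make the two junction tables concrete, here is a plain-Ruby sketch of what a shared "get_footprint" could look like. It reuses LA/LB/LC, BA/BB and EA2 from the earlier example; the struct names, the string IDs, and the method signature are assumptions for illustration, and lat/long fields are left out.

```ruby
# Hypothetical in-memory stand-ins for the tables discussed above.
Location             = Struct.new(:id, :situs_address)
BuildingLocationRow  = Struct.new(:building_id, :location_id)
ExistenceLocationRow = Struct.new(:existence_id, :location_id)

# Return the Locations linked to owner_id through the given join rows.
# owner_key names the join column to match (:building_id or :existence_id).
def get_footprint(join_rows, owner_key, owner_id, locations_by_id)
  join_rows
    .select { |row| row[owner_key] == owner_id }
    .map { |row| locations_by_id.fetch(row.location_id) }
end

locations_by_id = {
  "LA" => Location.new("LA", "723 Mass"),
  "LB" => Location.new("LB", "725 Mass"),
  "LC" => Location.new("LC", "727 Mass")
}

# BA covers LA and LB; BB covers LA, LB and LC.
building_locations = [
  BuildingLocationRow.new("BA", "LA"), BuildingLocationRow.new("BA", "LB"),
  BuildingLocationRow.new("BB", "LA"), BuildingLocationRow.new("BB", "LB"),
  BuildingLocationRow.new("BB", "LC")
]

# EA2 occupies the combined space of LA and LB.
existence_locations = [
  ExistenceLocationRow.new("EA2", "LA"), ExistenceLocationRow.new("EA2", "LB")
]

bb = get_footprint(building_locations, :building_id, "BB", locations_by_id)
puts bb.map(&:situs_address).inspect

ea2 = get_footprint(existence_locations, :existence_id, "EA2", locations_by_id)
puts ea2.map(&:situs_address).inspect
```

Because both junction tables have the same shape, one method can serve Buildings and Existences, and each Location is entered only once.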

Related tasks:

__ rename the "address" field of "locations" to "situs_address" (more technically accurate and to distinguish from "postal_address" which needs to be added to "existences" table)

__ add to location(s) class/table additional static info useful for doing equivalencies later: "lotnumber", "lotnumber_older", "lotnumber_early", "situs_address_older", "situs_address_early"

__ add a class "postal_address" that gets populated with all of the postal addresses used by Establishment Existences over time (postal address does not always equal situs address). Not sure if, as with buildings, we will stipulate that one Existence ends and a new one begins when the postal address changes (even if it stays in the same footprint)? This does happen. If so, then it would just be a one-to-many relationship between postal_addresses and existences. It also handles the cases where the postal address for an existence doesn't match one-to-one with the situs address: corner lots with multiple situs addresses, the mailbox sometimes being moved outside of the situs address lot area ("location" in our parlance) that the existence occupies, the existence occupying more than one situs address, and so forth. Might be good to just assume all situs addresses get added by admin up front and presented as a dropdown list during data entry to prevent any indexing issues.

The result would be that we can find out what the postal address of a given business existence was and also, for a given location, what all the various postal addresses used there over time were; for a given building, what all the postal addresses used there during its existence (or at a given time) were; for a given postal address, what all the businesses associated with it (or at a given time) were, etc. If nothing else, when someone examines an existence we can see at a glance what postal address was actually used for that existence and always list it with the info presented to the user, even though we're not actually using the postal_address to tie the Existence to the Location (because we're using locations via existence_locations instead).

__ utilities to convert street names and street addresses (the second would utilize the first) back and forth between current and older versions, e.g. "Henry Street" equals "8th Street" and "90 Massachusetts Street" equals "900 Massachusetts Street"
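A small plain-Ruby sketch of the conversion utilities in the last task, with the address converter built on the street-name converter. The lookup tables contain only the two examples given above; the constant and method names, the table shapes, and the choice to key renumbering on the current street name are all assumptions.

```ruby
# Hypothetical lookup tables; only the examples from the task list are filled in.
OLD_TO_CURRENT_STREET = { "Henry Street" => "8th Street" }.freeze

# Keyed by [current street name, old house number].
OLD_TO_CURRENT_NUMBER = { ["Massachusetts Street", 90] => 900 }.freeze

# Older street name to current name; unknown names pass through unchanged.
def current_street_name(name)
  OLD_TO_CURRENT_STREET.fetch(name, name)
end

# Current street name back to every older name recorded for it.
def old_street_names(name)
  OLD_TO_CURRENT_STREET.select { |_old, current| current == name }.keys
end

# Older street address to current address, using the street-name converter.
def current_street_address(address)
  number, street = address.split(" ", 2)
  street = current_street_name(street)
  number = OLD_TO_CURRENT_NUMBER.fetch([street, number.to_i], number.to_i)
  "#{number} #{street}"
end

puts current_street_name("Henry Street")
puts current_street_address("90 Massachusetts Street")
```

A table-driven approach is used here because the thread does not state a general renumbering rule; if one exists, it could replace the number table.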

stevedahlberg commented 11 years ago

Here is a screenshot of the relationships. I went ahead and added a "block_sides" table and intermediate "block_side_locations" junction table (corner locations at least might get used twice in defining block footprints, so again a many-to-many relationship):

[Image: relationships_2013-04-14]

stevedahlberg commented 11 years ago

Sources should probably be linked to block_sides like the others, for when we have photos of one side or the other (or both) of a given block, to cite a source for the photo. Could probably also add a description for the block_side but not sure how to handle it being a side. But, it seemed like a good idea to define block sides rather than blocks, since so much of the content is a shot of one side of a block. Some shots show both sides, some show one or both sides for multiple blocks. Perhaps when entering a photo, could tag it to multiple block sides (like 3 "east" block sides in a row, or 1 east and 1 west of the 700 block, etc)

GIS - am excited about the potential for this - now that we can build footprints of existences, buildings, and blocks (Blocks, Buildings & Businesses for a slogan) if we get more into GIS than just one ID point for each location and add fields to our location class to store GIS polygon information (to define the actual footprint of the location) then as we assemble these base location polygon info sets into existences, buildings, and blocks we're effectively building superset polygons at will that define regions of GIS coordinates valid for that entity.

ruralocity commented 11 years ago

I've got the updated relationships in place. A few notes:

stevedahlberg commented 11 years ago

Emailed you but we need a way to enter new and edit current Existences (button beneath enumeration area as in buildings maybe? for new, and then in-row buttons in enumeration area for current entries, both would be consistent with other forms for now). ... Can't really test until I can do that (I entered "Sunflower Outdoor & Bike" but can't create any Existences for it), but I can see the Existence Location piece and am stoked to have a parallel process to Building Locations now.

Agree with the non-street parts of addresses and also understand about Sources linking.

ruralocity commented 11 years ago

I've got this done, waiting for better internet to try deploying.

stevedahlberg commented 11 years ago

Can't wait :)

ruralocity commented 11 years ago

Finally got this source uploaded.

stevedahlberg commented 11 years ago

Awesome, hope to test later this afternoon ...

stevedahlberg commented 11 years ago

Aaron -

Hey, hate to point this out after how hard you've been cranking through this - tested briefly as a first pass before potentially entering a bunch of existences later, but while the Edit/Show/Delete and the New buttons for Existences do appear to be working wonderfully (!), on the Existences editing screen we're missing the "Existence Locations" piece (the sibling piece to the Buildings editing screen) ...

While I can enter the postal address, the mechanism for specifying multiple locations (in order to populate the existence_locations junction table) isn't there, so I can't specify a footprint (composed of Locations) as we enter the existences (as we now can for Buildings) ....

Do you think it will take much to copy or replicate the form parts from the Buildings side and incorporate into the Existences side, make sure they populate the corresponding table, etc ... ?

stevedahlberg commented 11 years ago

As an afterthought, I looked at the current models and I'm curious about the differences in code between BuildingLocation, which includes:

validates :building_id, presence: true
validates :location_id, presence: true

versus ExistenceLocation which has no matching statements like that ... Would the same need for validation apply?

ruralocity commented 11 years ago

Work backward from Establishments. Click the Show button for a corresponding existence. Then click New Location. This should be New Existence Location--I just missed relabeling the button. When in doubt, though, check the URL for the resulting view--they're all set up in such a way that you can tell what it's actually doing, even if the button lies. Anyway, you should see a single-element form that allows you to select a location from a select menu; the existence is pulled from the URL.

Eventually I think a nice Javascript-based UI will do a better job at making this all more user-friendly, but I'm not as adept at JS so I've been focused more on the backend since it will need to be there regardless of how the UI shakes out. I also left validations out for now on this until we were content with how things are mapping right now--easy enough to add them once we know that things are piecing together.

stevedahlberg commented 11 years ago

Excellent - my bad - in fact, the way you have it it is more symmetric with the Buildings side. The extra layer of Establishments was throwing me for some reason.

As far as validation, the OCD in me was screaming at it being there for Building Locations but not for Existence Locations :)

stevedahlberg commented 11 years ago

Entered a few things as a test run for entering a lot more tomorrow, hopefully. Noticed a few trivial things (that can wait). After adding a new location on the fly from the Existence Locations screen, there's no obvious way to get back to the Existence Location you were on before. There's also some funny tallying in the Buildings listing here and there: locations are tallied cumulatively rather than uniquely, and some locations in multiple-location footprints of Existences don't show their associated Establishments. But:

One more improvement (that might not be that time consuming?) that would help immensely even now during data entry: when adding Existence Locations for an Existence, have the Existence's description visible below the title and above the panes for listing, just as it is for Buildings and Establishments. Since the title alone doesn't always convey "which one" is being edited currently, it's really helpful to be able to glance up at the description to verify. Plus it helps double check prior data entry of Existence description info.

An example of such a screen would be:

http://infinite-waters-8556.herokuapp.com/existences/19

Hope that makes sense ...