Status

So far I've identified a possible means to accomplish this on page 8 of the Zoning Effect paper. Later I'll look into how to apply it.

About

What I really want is a means to compare the value of land on a site independently of the size of the site it is attached to. This should help with visualising where people want to live within an LGA, based on the assumption that the land value has priced all preferences in.

Why have I sought this out

Here are the sites in the CBD ranked by $ per sqm; look at the area of these sites.

While there are some data issues in the Valuer General data, when you sort the sites within Sydney by land value per sqm, it seems like smaller sites tend to rank higher on a per-sqm basis.

I think this is because there's a marginal rate at which land increases in value, where the next meter is worth less than the last. As you increase the size of the land, more types of projects become viable, but any meter you add beyond that does nothing for the projects that were already viable without it.

Why does this bother me

If we had the shapefile for every lot this wouldn't actually be a problem, but because we are aggregating by meshblock, a single Telstra phone booth can inflate the aggregated value of the meshblock if the aggregation is something like max or mean.

If you want to compare the value of land in Sydney independently of the size of the lot, and you're grouping by things like meshblock, aggregations like max or mean will be heavily skewed if there is a random 1x1 Telstra phone booth, which is the case in a few places in the CBD.

It would be nice to be able to weigh each meter independently of the size of the lot it's actually attached to, but maybe that is fanciful.
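
To make the skew concrete, here is a minimal sketch with made-up numbers (the lot values and areas are hypothetical, not taken from the Valuer General data). It compares a plain mean of each lot's $/sqm with an area-weighted figure (total value divided by total area), which is one way of counting every square meter equally regardless of which lot it sits on. I'm not sure this is the right fix; it only shows the size of the effect.

```python
from statistics import mean

# Hypothetical lots in one meshblock: (land_value_dollars, area_sqm).
# The last entry stands in for a 1x1 phone-booth-style site.
lots = [
    (2_000_000, 400.0),
    (3_000_000, 600.0),
    (50_000, 1.0),
]


def mean_of_rates(lots):
    """Plain mean of each lot's $/sqm; every lot counts equally."""
    return mean(value / area for value, area in lots)


def area_weighted_rate(lots):
    """Total value over total area; every square meter counts equally."""
    total_value = sum(value for value, _ in lots)
    total_area = sum(area for _, area in lots)
    return total_value / total_area


print(mean_of_rates(lots))       # pulled up by the 1 sqm site
print(area_weighted_rate(lots))  # close to the rate of the two large lots
```

With these numbers the plain mean is 20,000 $/sqm while the area-weighted figure is roughly 5,045 $/sqm, even though the two large lots are both valued at 5,000 $/sqm.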

Hacky solutions

So far I've included something like this in my land value aggregations, but it's fairly arbitrary and dishonest:

CASE
  WHEN p.area < 10 THEN 10
  ELSE p.area
END

Note, this isn't used in the data ingestion process but instead in the other notebooks where I've been trying to visualise the data.
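
For reference, here is the same floor written in Python, as a sketch of what it does to the per-sqm figure. The 10 sqm threshold is the arbitrary value from the CASE expression above; the function name and the example numbers are made up.

```python
def clamped_rate(land_value, area, min_area=10.0):
    """$/sqm with the area floored at min_area, mirroring the CASE expression."""
    return land_value / max(area, min_area)


# Hypothetical 1 sqm site valued at $50,000: treated as if it were 10 sqm.
print(clamped_rate(50_000, 1.0))

# Hypothetical 400 sqm site with the same value: unaffected by the floor.
print(clamped_rate(50_000, 400.0))
```

The floor only dampens sites below the threshold; a site of exactly 10 sqm still contributes its full rate, which is part of why the cutoff feels arbitrary.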

Possible projects once accomplished

This will allow for the creation of visualisations of different LGAs that show where the most valuable land is, without being skewed by small sites.

Maybe get a distribution of land values in an area by doing the following for each site:

def marginal_value_of_land(valuation, nth_meter):
    # The paper says this
    #    log(sale price) = c + b log(land area) + aX + e
    #
    # it would be neat to do something like
    #    (b log(nth_meter)) - (b log(nth_meter - 1))
    #
    # I don't even know if it makes sense to do that...
    # Possibly useless. Maybe this is more reasonable
    #    b log(1)
    # (though log(1) = 0, so that term is always 0)
    raise NotImplementedError("formula not settled yet")

def population_of_all_land_values(valuations):
    for v in valuations:
        # Meters are counted from 1 so that log(nth_meter) is defined.
        for nth_meter in range(1, v.sqm_area + 1):
            yield marginal_value_of_land(v, nth_meter)

With that population of land values you can see the distribution of land values. I'm honestly unsure what the most sensible way to do this is...
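
One way the idea could be made concrete, as a rough sketch rather than anything the paper prescribes: if the paper's relation is read as value being proportional to area ** b with everything else held fixed, then the nth square meter's share of a site's valuation is proportional to n ** b - (n - 1) ** b, and those shares add up to the whole valuation. The `Valuation` class, the `b` value of 0.5, and the example site are all assumptions for illustration. A real `b` would have to come from a fitted model, and this applies a sale-price relation to land valuations, which may not be valid.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Valuation:
    # Hypothetical record shape: total land value and whole-number area.
    land_value: float
    sqm_area: int


def marginal_value_of_land(v: Valuation, nth_meter: int, b: float) -> float:
    """Share of v.land_value attributed to the nth square meter (1-based),
    assuming value is proportional to area ** b."""
    share = (nth_meter ** b - (nth_meter - 1) ** b) / (v.sqm_area ** b)
    return v.land_value * share


def population_of_all_land_values(
    valuations: Iterable[Valuation], b: float
) -> Iterator[float]:
    for v in valuations:
        for nth_meter in range(1, v.sqm_area + 1):
            yield marginal_value_of_land(v, nth_meter, b)


B_ASSUMED = 0.5  # placeholder elasticity, NOT an estimate from the paper
site = Valuation(land_value=1_000_000.0, sqm_area=100)
values = list(population_of_all_land_values([site], B_ASSUMED))

# First meter, last meter, and the total (which recovers the valuation).
print(values[0], values[-1], sum(values))
```

With `b` below 1 the first meter carries the most value and each later meter carries less, which is at least consistent with smaller sites ranking higher per sqm.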

Solutions

- It's possible this methodology of comparing land via these aggregations is flawed and I should look at other methodologies.
- It's possible there's some kind of coefficient you can figure out from hedonic pricing models?
- This is only a problem because I'm aggregating multiple properties by meshblock; if I had the shapefiles for the actual properties it wouldn't be.

It's entirely possible I'm looking at this all wrong. I think it's best to establish a better understanding of the nature of things before proposing a fix. Let's see what the research says about it.

Consider reading the RBA paper on the Zoning Effect.

Notes from reading the paper "The Effect of Zoning on Housing Prices"

20240912
- page 2, there's immediate mention of a "marginal value of land"
- page 5, mentions it's worth noting since lands may be deflated as sometimes land owners despite the high valuations of their land by not lower.
- page 8 (web link), here marginal value of land is explicitly mentioned.
- this relation was shown: log(sale price) = c + b log(land area) + aX + e
- Is it possible to use this with substitution to get the marginal value of land?
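
A possible answer, sketched under the assumption that the relation holds exactly and that X and e do not depend on land area. Here P stands for sale price and A for land area; both symbols are my shorthand, not the paper's.

```latex
\begin{aligned}
\log P &= c + b \log A + aX + e \\
P &= \exp(c + aX + e)\, A^{b} \\
\frac{\partial P}{\partial A} &= b \exp(c + aX + e)\, A^{b-1} = b\,\frac{P}{A}
\end{aligned}
```

Under this reading the marginal value of land is b times the average price per sqm, and for 0 < b < 1 it falls as the site gets larger.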

AKST / Australian-Address-Boundaries-Land-Property-Price-Database

Find way to factor in the marginal value of land when aggregating many sites #9