ariesteam / aries

http://www.ariesonline.org
GNU General Public License v3.0

the big training issue #60

Closed - kbagstad closed this issue 12 years ago

kbagstad commented 12 years ago

@fvilla @bvoigt @lambdatronic

Alright folks, I was gonna do this as an email, then my better instincts won out, so it's going in as an issue. I've successfully trained the following BNs: Puget carbon source, Puget flood sink, Puget sediment source, San Pedro carbon source, Madagascar carbon source, Madagascar carbon sink, and California carbon source. Five minor issues are holding up all the other ones, of which fixing Puget and San Pedro is most urgent. Get these done and we should be set on training for everything except Ontario (and I imagine fixing these will make Ontario training go more smoothly).

Here are the problems I've encountered:

  1. The soil carbon storage layer is causing problems for most of the carbon sink models. The data are there if you measure -d them, but model -d shows no data, and no (or minimal) evidence enters the training system. Oddly enough this is a single global layer, and it seems to be working fine for Madagascar but not for Puget, San Pedro, or California.
  2. For the Puget carbon model, zeroes in the data are being incorrectly interpreted as nodata.
  3. There are overlap problems with "tiled" datasets (ones that aren't really square tiles but are designed to fit together seamlessly) like NLCD and NBCD that are used in training for some models, particularly the San Pedro carbon models. I encountered this problem separately as GitHub issue #32 and have never been able to get a straight answer as to whether it is an easy or a hard problem to fix.
  4. Some model -o statements, particularly for the San Pedro surface water sink and Madagascar carbon source models, are giving an error I've never seen before. Running either model -o mgcsource14.nc core.models.carbon-mg/source core.contexts.mg/mg-mainland-simple or model -o spwater14.nc core.models.water-san-pedro/infiltration-sink core.contexts.san-pedro/san-pedro-watershed-us-real yields the error "observation urn:uuid:ins:f442d9c2-19b2-4c61-b433-e9a09101fc74 has no observable" - what does this mean?
  5. In trying to train the California flood sink models I get the error "Input/output error: usa:nlcd_4: I/O error reading image metadata!: url = http://ecoinformatics.uvm.edu/geoserver/wcs?service=WCS&version=1.0.0&request=GetCoverage&coverage=usa:nlcd_4&bbox=-118.19829941621056,33.230093820887824,-116.69965953631186,34.38727144967037&crs=EPSG:4326&responseCRS=EPSG:4326&width=158&height=122&format=geotiff". This is strange, as I used NLCD data to successfully train the California carbon models and many others (see the request-replay sketch after this list).
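
A minimal way to check whether that last failure comes from geoserver itself or from the thinklab side is to replay the GetCoverage request outside the modelling system. The sketch below is not part of the original report; it assumes Python with the requests package and a reachable server, and simply reissues the URL copied from the error message:

```python
# Replay the failing WCS GetCoverage request outside thinklab.
# Assumes the `requests` package; URL and parameters are copied verbatim from
# the error message quoted above.
import requests

params = {
    "service": "WCS",
    "version": "1.0.0",
    "request": "GetCoverage",
    "coverage": "usa:nlcd_4",
    "bbox": "-118.19829941621056,33.230093820887824,"
            "-116.69965953631186,34.38727144967037",
    "crs": "EPSG:4326",
    "responseCRS": "EPSG:4326",
    "width": 158,
    "height": 122,
    "format": "geotiff",
}

resp = requests.get("http://ecoinformatics.uvm.edu/geoserver/wcs", params=params)
content_type = resp.headers.get("Content-Type", "")
print(resp.status_code, content_type)

# Geoserver typically returns an XML ServiceException (rather than a GeoTIFF)
# when the bbox does not intersect the coverage; printing it shows the
# server-side view of the failure.
if "xml" in content_type:
    print(resp.text[:500])
```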
kbagstad commented 12 years ago

Well the numbering got messy when I posted this issue - if you look carefully you'll see 5 issues, not 4. Yes, despite my flaws as a programmer and modeler, I can at least count to 5.

fvilla commented 12 years ago

For (1) and (1) (the soil carbon layer and the zeroes-read-as-nodata problems), I really need a way to reproduce them. In particular, for the second of those, thinklab does no such thing as "interpreting" numbers - I'm pretty sure that a look at the model will reveal where the interpretation is.

(2) (the tile-overlap problem) is a very hard problem to fix, as it depends on whether information integrated after multiple warpings can still overlap seamlessly. Even if the projections don't change, there is warping because we extract stuff at arbitrary resolutions. It is intimately tied to how geoserver works and there isn't much to be done from the thinklab side. Hope this is straight enough for now - the bottom line is, don't expect to see this solved soon.
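
As a toy illustration of that warping effect (plain Python, not ARIES or thinklab code, with made-up numbers): two tiles that meet exactly at x = 100 stop lining up once each is resampled independently to an arbitrary extraction resolution.

```python
import math

def cell_edges(xmin, xmax, target_res):
    """Cell edges a naive per-tile resampler would produce, anchored at xmin."""
    ncells = math.ceil((xmax - xmin) / target_res)
    return [xmin + i * target_res for i in range(ncells + 1)]

res = 0.37  # an "arbitrary" extraction resolution, as described above

tile_a = cell_edges(0.0, 100.0, res)    # tile A covers 0..100
tile_b = cell_edges(100.0, 200.0, res)  # tile B covers 100..200

# Tile A's last cell edge overshoots the shared boundary at 100.0 while tile B
# starts exactly at 100.0, so the two warped tiles overlap by part of a cell.
print("tile A last edge :", tile_a[-1])   # ~100.27
print("tile B first edge:", tile_b[0])    # 100.0
```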

For (3) (the "has no observable" error), I indeed forgot to mention that there is a subtle and obscure problem with training messing up the models that are trained. It will appear if you run a model in the same session where you have trained it, but should not appear after exiting and restarting. So if you try again (without having just run a 'train' command on those models) you should not see any problem. Let me know if that's not the case.

(4) (the NLCD I/O error) looks like a geoserver problem, but the URL you pasted in is actually a request for a bounding box that doesn't intersect the requested data layer - which shouldn't happen, as the overlap is checked beforehand. Could it be something very close to the edges of the layer in question and very small compared to the layer's extent, to the point that rounding to the resolution ends up with a bbox that does not fall within the layer?
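
A back-of-the-envelope sketch of that hypothesis (plain Python, not thinklab code; layer extent and resolution are made up): snapping a tiny bbox that hugs the layer edge outward to whole pixels can push it past the layer.

```python
import math

layer_xmin, layer_xmax = -125.0, -118.2   # hypothetical layer extent (x only)
res = 0.25                                # hypothetical native resolution

req_xmin, req_xmax = -118.21, -118.205    # tiny bbox hugging the eastern edge

# Snap the requested bbox outward to whole pixels from the layer origin.
snap_xmin = layer_xmin + math.floor((req_xmin - layer_xmin) / res) * res
snap_xmax = layer_xmin + math.ceil((req_xmax - layer_xmin) / res) * res

print(snap_xmin, snap_xmax)      # -118.25 -118.0
print(snap_xmax > layer_xmax)    # True: the rounded bbox pokes outside the layer
```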

kbagstad commented 12 years ago

Thanks Ferd. I'm taking a closer look at the first #1 (the soil carbon layer) and may have figured it out - will let you know.

For the second #1 (the zeroes-read-as-nodata problem), I could use a second pair of eyes. I meant to say the issue was with the fire threat class, and the problem is that real evidence is getting passed to the training algorithm as nodata. Try model -o firetest.nc core.models.carbon-puget/fire-threat core.contexts.puget/puget-watershed-complex. What you'll (very likely) get is an output with nodata inside the context in places where the data itself had values of zero, which should have been reclassed as lowfirethreat (the layer itself is on the raid at raid/geodata/coverages/puget/fire/Fire_WAOR.tif). The zeros are indeed true zeros, not the goofy nodata values Brian brought up an issue or two ago. I'm really stumped on this one, though it's probably something obvious - the behavior should be very straightforward, but it's not.
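
One way to sanity-check the layer itself is to compare the GeoTIFF's declared nodata value against the cells that are actually zero. This is a sketch assuming Python with rasterio installed and access to the raid path above; rasterio is not part of the ARIES toolchain, just a convenient reader.

```python
import numpy as np
import rasterio

# Path as given above; adjust the prefix for wherever the raid is mounted.
PATH = "raid/geodata/coverages/puget/fire/Fire_WAOR.tif"

with rasterio.open(PATH) as src:
    band = src.read(1)
    print("declared nodata value:", src.nodata)
    print("cells equal to zero  :", int(np.sum(band == 0)))
    if src.nodata is not None:
        print("cells equal to nodata:", int(np.sum(band == src.nodata)))

# If the declared nodata value is not 0 (or there is none) while many cells are
# exactly 0, those zeros are genuine data and should classify as lowfirethreat
# rather than being dropped as nodata.
```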

kbagstad commented 12 years ago

OK, the first #1 (the soil carbon layer) was my mistake in the discretization for the places where it wasn't working properly - fixed, and I can now train the San Pedro carbon sink.

I still need a hand on the second #1 (zeroes read as nodata) - this is preventing me from finalizing the Puget carbon sink training, the highest-priority part of this issue.

For #3 (the post-training "no observable" problem), it seems like I am still running into model -o problems after training, even when I exit and re-pload. It might be worth trying a few times on your end to see if you can replicate it.

For #4 (the NLCD I/O error), this is a really strange problem that affects any NLCD data call by the California models (try model -o cacvttest.nc core.models.carbon-ca/land-use core.contexts.california/flood-watersheds, for example). The weird thing is that it's looking for the nlcd_4 layer, which has no coverage of California and would explain the lack of overlap, but there's a perfectly good nlcd_2 layer that the system should find when it looks for NLCD data in California. I've played around quite a bit with nlcd_2 and as far as I can tell it is a good layer, so I have no idea why that modeling call ignores nlcd_2 and reaches for nlcd_4, which causes the overlap problem. This will hopefully be resolved for good once I get a student up and running on the new national NLCD layer, but for now it remains a problem.
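
To confirm which NLCD coverages actually overlap the failing California bbox, one can ask geoserver directly with WCS DescribeCoverage requests and compare the reported lon/lat envelopes by eye. This is a sketch assuming Python with the requests package; the coverage names and the context bbox are the ones mentioned in the error above.

```python
import requests

WCS_URL = "http://ecoinformatics.uvm.edu/geoserver/wcs"

# The bbox from the failing GetCoverage request quoted in the first comment.
context_bbox = (-118.198, 33.230, -116.700, 34.387)
print("California context bbox:", context_bbox)

for coverage in ("usa:nlcd_2", "usa:nlcd_4"):
    resp = requests.get(WCS_URL, params={
        "service": "WCS",
        "version": "1.0.0",
        "request": "DescribeCoverage",
        "coverage": coverage,
    })
    # Crude extraction: print the lonLatEnvelope block from the XML so each
    # layer's extent can be compared against the context bbox above.
    text = resp.text
    start = text.find("lonLatEnvelope")
    print(coverage, text[start:start + 300] if start != -1 else "(no envelope found)")
```

If the description above is right, nlcd_2's envelope should contain the California context while nlcd_4's should not, which is what the data lookup ought to be checking before it picks a layer.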

kbagstad commented 12 years ago

OK, more detail on #3 (the problem that's keeping me from running the Puget flood or sediment models): running both

model -o floodsinktrained.nc core.models.flood-puget/sink-annual core.contexts.puget/wria-green

and

model -o sedsourcektrained.nc core.models.sediment-puget/source-puget core.contexts.puget/wria-lyre

gives me the quite unhelpful error "null".

I tried wiping the jpf shadow, doing an ant clean and reinstall, etc., with no luck. Also tried running on huginn - no luck there either. Quite frustrating.

kbagstad commented 12 years ago

OK, the second #1 was solved by adding a hasnodata tag to the .xml. #3 is a BN problem - I'll investigate more closely - but not a system problem. #4 remains a problem, but it would be fixed by taking care of the NLCD zoning issue, which will be a priority for my student. Closing this issue. Ferd, let's hit issue #53 tomorrow or Friday if at all possible.