nextgis / qgis_molusce

Modules for Land Use Change Simulations
https://github.com/nextgis/molusce
GNU General Public License v2.0

QGIS ANN simulation results show little to no changes #27

Open MarijaBogdanova opened 6 years ago

MarijaBogdanova commented 6 years ago

Hello! I seem to have a problem with simulation using ANN transition potential modelling. I have two land cover maps, from 2003 and 2018 (both integer), and a set of 11 factors (both integer and float). The problem is that with Logistic Regression and WoE I get more or less the desired results (changes are simulated in all land cover classes), but with ANN the changes are very limited, sometimes ignoring whole classes. I know it is supposed to 'learn' from the past development of land cover. But in my case it seems to ignore the past development of the urban cover (urban sprawl, basically), which continues to happen and which I expected to show in the simulation for 2033.

I tried changing the number of samples, the number of hidden layers and the learning rate, and left it training overnight, but the results all still look very similar, with very few changes being simulated. What could be the problem? Maybe something is wrong with my inputs? Here is a link to my inputs: https://www.dropbox.com/sh/iudnonedzuc5lrq/AAC8jZ01oJDuLHp24Oco1Ijca?dl=0

There are two inputs for the initial and final years (2003/2018_smaller_int); the other 11 are factors (such as distance to roads, rivers, technical infrastructure, schools and bus stops, as well as zoning, population location and count of registered firms).

I would be very grateful if you could take a look and see where the problem might be.

KolesovDmitry commented 6 years ago

Hello, Marija, could you please answer the following questions?

MarijaBogdanova commented 6 years ago

Hello, 1) Since I thought the random sampling procedure didn't work for me, I used "all" sampling (every time I tried stratified sampling, QGIS crashed, so that didn't work for me either). At some point I set the number of samples to 10,000 and 50,000, and the result was almost the same (no noticeable changes).

2) I will attach some learning curves in the same dropbox folder right away!

KolesovDmitry commented 6 years ago

Stratified sampling is preferable when classes are unbalanced. If you use 'all' samples, the ANN might ignore some rare classes/transitions, because prediction errors on rare transitions add little to the cumulative error. So I think the most important issue is the QGIS crash. Could you provide more information about it?
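To illustrate the difference between the two sampling modes, here is a minimal sketch in plain Python. It is not MOLUSCE's own sampler: the function name `stratified_sample`, the `per_class` parameter and the toy rasters are assumptions made up for this example. It groups pixels by their (initial class, final class) transition and draws the same maximum number of pixels from each group, so a rare transition is not drowned out by the dominant one.

```python
import random
from collections import defaultdict


def stratified_sample(initial, final, per_class, seed=0):
    """Pick up to `per_class` pixel coordinates for every (from, to) transition.

    `initial` and `final` are equally sized 2D lists of integer class codes.
    Hypothetical helper for illustration only, not part of MOLUSCE.
    """
    rng = random.Random(seed)

    # Group pixel coordinates by the transition they represent.
    by_transition = defaultdict(list)
    for r, (row_a, row_b) in enumerate(zip(initial, final)):
        for c, (a, b) in enumerate(zip(row_a, row_b)):
            by_transition[(a, b)].append((r, c))

    # Draw the same maximum number of pixels from every group.
    sample = {}
    for transition, pixels in by_transition.items():
        k = min(per_class, len(pixels))
        sample[transition] = rng.sample(pixels, k)
    return sample


# Toy rasters: class 1 dominates, and the transition 1 -> 3 occurs only once.
initial = [[1, 1, 1, 1],
           [1, 1, 1, 1],
           [1, 1, 2, 2]]
final = [[1, 1, 1, 1],
         [1, 1, 1, 3],
         [1, 1, 2, 2]]

sample = stratified_sample(initial, final, per_class=2)
sizes = {t: len(p) for t, p in sample.items()}
```

With 'all' sampling the single 1 -> 3 pixel would be one training example out of twelve; in the stratified sample it is one out of five, so its prediction error carries more weight.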

MarijaBogdanova commented 6 years ago
  1. The screenshots you asked for with stratified sampling are (partially) here: https://www.dropbox.com/sh/p3q0hksl7sew3zb/AAA-fr7typk4tFQGxS2WgWUna?dl=0 (partially, because QGIS freezes at the transition potential modelling step and doesn't go any further).

I see why stratified sampling is preferable. But 'all' samples worked in the case of logistic regression (results attached in the new link, I didn't make it up!).

  2. The PC I used for simulations with 'all' (and failed to use with 'stratified') runs Windows 10 with 16 GB RAM (a screenshot of the other parameters is also included in the link). I also tried doing the same thing (with the same result) on a laptop (also Windows with 16 GB RAM, a slightly more powerful processor, the same versions of QGIS and MOLUSCE).

KolesovDmitry commented 6 years ago

I did some investigating and see two problems:

  1. There is a bug that causes an infinite loop for particular input data (I created issue #28 about it). In your case: (a) your data has only one pixel for the transition Class3->Class7; (b) this pixel lies on the raster boundary (see coordinates (519553.5, 298344.4)); (c) you ask the ANN to analyse not only the pixels themselves but also their neighbours (see the neighbourhood parameter). So the sampling procedure can't create a sample from the data and falls into an infinite loop. This is a bug and it has to be fixed. But in your case I would advise removing this pixel from your data, because one pixel isn't enough to produce meaningful statistics.
  2. The second question is why the ANN can't create a tolerable model while LR can (with 'all' sampling). There are many possible causes, and some investigation is needed. (If you like, write to me directly by email; that will be faster than here.)
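The condition described in point 1 can be checked before running the model. The following is a minimal sketch in plain Python, not MOLUSCE code: the function name `unsampleable_transitions` is hypothetical, and it assumes that `neighbourhood` is a window radius in pixels and that a pixel can only be sampled when its whole window fits inside the raster. It lists transitions for which no such pixel exists.

```python
from collections import defaultdict


def unsampleable_transitions(initial, final, neighbourhood):
    """Return {transition: pixel_count} for transitions with no usable pixel.

    A pixel is treated as usable only if a window of radius `neighbourhood`
    around it lies entirely inside the raster (an assumption of this sketch).
    `initial` and `final` are equally sized 2D lists of integer class codes.
    """
    rows, cols = len(initial), len(initial[0])
    total = defaultdict(int)   # all pixels per transition
    usable = defaultdict(int)  # pixels far enough from the raster edge

    for r in range(rows):
        for c in range(cols):
            t = (initial[r][c], final[r][c])
            total[t] += 1
            inside = (neighbourhood <= r < rows - neighbourhood
                      and neighbourhood <= c < cols - neighbourhood)
            if inside:
                usable[t] += 1

    return {t: n for t, n in total.items() if usable[t] == 0}


# Toy rasters: a single 3 -> 7 pixel sits in the top-right corner.
initial = [[1, 1, 1, 3],
           [1, 1, 1, 1],
           [1, 1, 1, 1],
           [1, 1, 1, 1]]
final = [[1, 1, 1, 7],
         [1, 1, 1, 1],
         [1, 1, 1, 1],
         [1, 1, 1, 1]]

problems = unsampleable_transitions(initial, final, neighbourhood=1)
no_problems = unsampleable_transitions(initial, final, neighbourhood=0)
```

Here `problems` reports the 3 -> 7 transition with its single pixel, while with a neighbourhood of 0 nothing is reported. Transitions flagged this way are candidates for cleaning from the input maps, as advised above.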

Pujasatya commented 2 years ago

Have you resolved this issue? If so, please share how; I am facing the same problem.