JamesBremner / so77307162

Group graph vertices according to attributes.

read test data sets from text files #1

Closed JamesBremner closed 1 year ago

JamesBremner commented 1 year ago

vertices = dict([('g0-v0', {'coords': (289505.29113884346, 3809582.8002458056, 3.4325371831655502)}), ('g0-v1', {'coords': (289505.89137526846, 3809584.4667251427, 3.4499223306775093)}), ('g0-v2', {'coords': (289504.4185388949, 3809584.992423441, 4.262407331727445)}), ('g0-v3', {'coords': (289502.92465452605, 3809585.5074766655, 3.3520891135558486)}), ('g0-v4', {'coords': (289502.1874815416, 3809584.046727399, 3.4383309045806527)}), ('g0-v5', {'coords': (289503.2098872459, 3809581.929851732, 4.259595467709005)}), ('g0-v7', {'coords': (289505.77128353063, 3809584.1539462083, 2.776375670917332)}), ('g0-v9', {'coords': (289502.6837983986, 3809585.4038596363, 2.815225132741034)}), ('g1-v0', {'coords': (289509.6128579881, 3809575.3195456313, 2.2534847604110837)}), ('g1-v1', {'coords': (289510.70581039955, 3809577.783933704, 3.5839293217286468)}), ('g1-v2', {'coords': (289511.4035359267, 3809579.731115867, 3.7048939522355795)}), ('g1-v3', {'coords': (289512.40618138015, 3809582.4432819095, 2.334692606702447)}), ('g1-v4', {'coords': (289501.51873695327, 3809586.4752561618, 2.51658580545336)}), ('g1-v5', {'coords': (289500.5641674058, 3809583.921679454, 3.831248564645648)}), ('g1-v6', {'coords': (289500.13612185075, 3809582.9820827004, 3.7128401873633265)}), ('g1-v7', {'coords': (289498.77303475, 3809579.366229753, 2.283148379996419)}), ('g1-v8', {'coords': (289500.88263875083, 3809578.604004119, 2.2721760366111994)}), ('g1-v9', {'coords': (289500.9781923641, 3809578.7744675647, 2.419397037476301)}), ('g1-v10', {'coords': (289500.0446753629, 3809579.125590793, 2.4324609553441405)}), ('g1-v11', {'coords': (289501.4418613444, 3809582.501332389, 4.319459212012589)}), ('g1-v12', {'coords': (289509.73036577494, 3809579.347299192, 4.246670736931264)}), ('g1-v13', {'coords': (289508.06265764986, 3809576.1304689297, 2.372382231988013)}), ('g1-v14', {'coords': (289507.5055342968, 3809576.3570000282, 2.2798113031312823)}), ('g1-v15', {'coords': (289507.40741197416, 
3809576.0942560057, 2.224258623085916)}), ('g1-v18', {'coords': (289508.27952180186, 3809575.782694147, 3.432461282238364)}), ('g1-v19', {'coords': (289500.0481945323, 3809578.878560306, 3.4992404812946916)}), ('g1-v22', {'coords': (289499.70789473713, 3809581.873367534, 3.6957009537145495)}), ('g1-v25', {'coords': (289510.0618469392, 3809579.1657374133, 4.2692895056679845)}), ('g1-v28', {'coords': (289508.9101185086, 3809579.659428844, 4.253872682340443)}), ('g1-v29', {'coords': (289507.88647800055, 3809581.7532236963, 3.4469804102554917)}), ('g1-v30', {'coords': (289508.7267558488, 3809583.256266117, 2.686913950368762)}), ('g1-v31', {'coords': (289511.3417801809, 3809582.032448787, 2.765864943154156)}), ('g1-v32', {'coords': (289510.9850381132, 3809580.6364425477, 3.4139089938253164)}), ('g1-v33', {'coords': (289511.58398212894, 3809582.2796163415, 3.448804317973554)}), ('g1-v34', {'coords': (289510.08774438093, 3809582.853807372, 4.274466275237501)}), ('g1-v35', {'coords': (289508.64613766386, 3809583.3918638965, 3.494000982493162)}), ('g1-v36', {'coords': (289501.20877576683, 3809582.608069639, 4.32090550288558)}), ('g2-v0', {'coords': (289483.08079936163, 3809581.081368443, 1.6217583119869232)}), ('g2-v1', {'coords': (289483.6255049363, 3809579.6821401836, 1.6030659694224596)}), ('g2-v2', {'coords': (289483.50581311376, 3809579.4650836852, 1.4618312260136008)}), ('g2-v3', {'coords': (289484.01704586693, 3809578.3094069315, 1.4382868381217122)}), ('g2-v4', {'coords': (289484.29531448317, 3809578.236822284, 1.5649760030210018)}), ('g2-v6', {'coords': (289484.8317422634, 3809577.052280114, 1.5955992126837373)}), ('g2-v7', {'coords': (289485.46987723326, 3809576.788683092, 1.773581468500197)}), ('g2-v8', {'coords': (289485.7199956748, 3809577.341426862, 2.0312139401212335)}), ('g2-v9', {'coords': (289486.15113476827, 3809577.225022448, 2.197777848690748)}), ('g2-v12', {'coords': (289486.5945713894, 3809577.0625013555, 2.0207358561456203)}), ('g2-v14', {'coords': 
(289486.4019410508, 3809576.48708044, 1.7745313439518213)}), ('g2-v17', {'coords': (289487.0800194229, 3809576.298943132, 1.4836841635406017)}), ('g2-v18', {'coords': (289490.87378455023, 3809577.949905314, 1.574909107759595)}), ('g2-v19', {'coords': (289490.8696355417, 3809578.1947361683, 1.7066296795383096)}), ('g2-v20', {'coords': (289491.51085372467, 3809576.809636472, 1.6801795084029436)}), ('g2-v21', {'coords': (289493.21269875986, 3809576.1465115435, 1.6421171138063073)}), ('g2-v22', {'coords': (289494.905870488, 3809576.9526914097, 1.698048540391028)}), ('g2-v23', {'coords': (289495.5369774403, 3809578.596833082, 1.746348861604929)}), ('g2-v24', {'coords': (289494.93250684463, 3809580.02379859, 1.7478158818557858)}), ('g2-v25', {'coords': (289495.1349019399, 3809579.8813362634, 1.6027319263666868)}), ('g2-v26', {'coords': (289496.1180691675, 3809580.3497205866, 1.6008373526856303)}), ('g2-v27', {'coords': (289499.3717700927, 3809579.1061575883, 1.5679800547659397)}), ('g2-v28', {'coords': (289499.36769331107, 3809579.2163494034, 1.7064487617462873)}), ('g2-v29', {'coords': (289498.81283589103, 3809579.5159336245, 1.6720591699704528)}), ('g2-v30', {'coords': (289499.2840574782, 3809580.8404292674, 2.3972666896879673)}), ('g2-v31', {'coords': (289500.200616307, 3809582.9837737014, 1.3126656645908952)}), ('g2-v35', {'coords': (289498.44473032665, 3809583.564916649, 1.3485598219558597)}), ('g2-v36', {'coords': (289495.405867854, 3809582.3059755997, 2.493828653357923)}), ('g2-v37', {'coords': (289496.05602186895, 3809581.0785058863, 1.946759465150535)}), ('g2-v39', {'coords': (289494.0510869492, 3809584.389650086, 2.451877100393176)}), ('g2-v40', {'coords': (289494.5430124765, 3809582.4082830907, 2.940911915153265)}), ('g2-v42', {'coords': (289491.9326455266, 3809581.2299804497, 2.8697928115725517)}), ('g2-v43', {'coords': (289491.1402483041, 3809580.938029969, 2.9008032036945224)}), ('g2-v44', {'coords': (289489.6917326223, 3809581.4136777413, 
3.4179105162620544)}), ('g2-v45', {'coords': (289487.21156211715, 3809580.3759020353, 3.380049974657595)}), ('g2-v46', {'coords': (289486.03564716066, 3809579.861862689, 2.7432506876066327)}), ('g2-v47', {'coords': (289484.6484217647, 3809580.3082217597, 2.2071687383577228)}), ('g2-v48', {'coords': (289484.1550203264, 3809581.6248097457, 2.1427575033158064)}), ('g2-v54', {'coords': (289493.1637555723, 3809578.5549829057, 2.824695447459817)}), ('g2-v64', {'coords': (289494.47331879125, 3809582.756890577, 3.0563039276748896)}), ('g3-v0', {'coords': (289479.01320350706, 3809592.3634230834, 2.4333399906754494)}), ('g3-v1', {'coords': (289478.83181306015, 3809591.727689541, 2.386751656420529)}), ('g3-v2', {'coords': (289479.3731438945, 3809590.5262232274, 2.3455128464847803)}), ('g3-v3', {'coords': (289479.92867658363, 3809590.279055865, 2.3353928970173)}), ('g3-v4', {'coords': (289479.6602040716, 3809590.858248691, 2.67400826420635)}), ('g3-v5', {'coords': (289479.26298549067, 3809591.739514166, 2.6815256625413895)}), ('g4-v0', {'coords': (289482.4651876502, 3809590.9737581643, 3.553054158575833)}), ('g4-v1', {'coords': (289484.6317295567, 3809585.9617136447, 6.017564969137311)}), ('g4-v2', {'coords': (289482.2090670831, 3809584.819650307, 6.000647897832096)}), ('g4-v3', {'coords': (289483.376433263, 3809582.178296351, 7.293851716443896)}), ('g4-v4', {'coords': (289484.5190009417, 3809579.8234195644, 5.9166597835719585)}), ('g4-v5', {'coords': (289486.1551758555, 3809579.40342302, 5.3667474957183)}), ('g4-v6', {'coords': (289489.7511207877, 3809580.9700859264, 5.380998072214425)}), ('g4-v8', {'coords': (289491.3410344729, 3809580.540291847, 4.840354504995048)}), ('g4-v9', {'coords': (289494.6870825094, 3809582.003593931, 4.812238761223853)}), ('g4-v10', {'coords': (289495.3259429523, 3809583.6550926864, 5.485052427276969)}), ('g4-v11', {'coords': (289493.8644936806, 3809586.949030804, 7.381379863247275)}), ('g4-v12', {'coords': (289492.6835496159, 3809589.66111089, 
5.927439618855715)}), ('g4-v13', {'coords': (289490.1152518763, 3809590.677952048, 5.049306926317513)}), ('g4-v14', {'coords': (289488.37251672684, 3809589.906938164, 5.069095809943974)}), ('g4-v15', {'coords': (289486.88625800883, 3809593.25655928, 3.51719784270972)}), ('g4-v17', {'coords': (289488.19183841767, 3809589.6033270825, 3.604136047884822)}), ('g4-v18', {'coords': (289490.19329013163, 3809590.301819089, 2.5490759778767824)}), ('g4-v19', {'coords': (289489.08249117294, 3809592.7618413847, 2.586450886912644)}), ('g4-v20', {'coords': (289489.619213288, 3809592.9973884057, 2.571209759451449)}), ('g4-v21', {'coords': (289488.29490780295, 3809595.938866629, 4.23842316865921)}), ('g4-v22', {'coords': (289486.92887169984, 3809598.9218659946, 2.657242518849671)}), ('g4-v23', {'coords': (289486.36400955205, 3809598.6701575397, 2.640425452031195)}), ('g4-v24', {'coords': (289485.290990395, 3809598.1698200023, 3.273051689378917)}), ('g4-v26', {'coords': (289484.4860255022, 3809599.9076789366, 3.3046739120036364)}), ('g4-v27', {'coords': (289483.8230472922, 3809599.5909538963, 3.7548816334456205)}), ('g4-v28', {'coords': (289483.20636091067, 3809599.341622821, 3.745818270370364)}), ('g4-v29', {'coords': (289483.4985634192, 3809598.6911704177, 4.0851341392844915)}), ('g4-v30', {'coords': (289482.6250518036, 3809598.294584799, 4.030401179566979)}), ('g4-v31', {'coords': (289482.3724584045, 3809598.896244792, 3.734459549188614)}), ('g4-v32', {'coords': (289480.450362809, 3809598.0225363574, 3.709920003078878)}), ('g4-v33', {'coords': (289480.2048526807, 3809597.8760972507, 3.488342931494117)}), ('g4-v34', {'coords': (289481.28022064234, 3809595.4685293236, 3.461932540871203)}), ('g4-v35', {'coords': (289479.8172326384, 3809594.830899892, 2.693051228299737)}), ('g4-v37', {'coords': (289479.6995939894, 3809594.4092881894, 2.587766961194575)}), ('g4-v38', {'coords': (289479.7613328197, 3809594.289845235, 2.6287123765796423)}), ('g4-v39', {'coords': (289478.2051561482, 
3809593.536503854, 2.582464028149843)}), ('g4-v40', {'coords': (289479.2954733621, 3809591.210564432, 3.866884051822126)}), ('g4-v42', {'coords': (289480.40739325905, 3809588.952592213, 2.568576207384467)}), ('g4-v43', {'coords': (289481.90547528723, 3809589.5936430045, 2.5431167567148805)}), ('g4-v44', {'coords': (289480.44008478365, 3809589.1119077955, 1.9394174329936504)}), ('g4-v46', {'coords': (289480.17402376037, 3809589.6812003707, 1.945013415068388)}), ('g4-v47', {'coords': (289479.74632796523, 3809589.463101419, 1.6615103986114264)}), ('g4-v48', {'coords': (289482.23655559774, 3809584.150537456, 1.5518957627937198)}), ('g4-v49', {'coords': (289482.79481579113, 3809584.355787751, 1.8986105006188154)}), ('g4-v50', {'coords': (289482.6484146575, 3809584.757735084, 1.9596313275396824)}), ('g4-v51', {'coords': (289485.1917839021, 3809585.764020973, 3.1308537758886814)}), ('g4-v52', {'coords': (289482.51720548247, 3809591.006972457, 3.1112184803932905)}), ('g4-v53', {'coords': (289483.4197953283, 3809591.392162719, 3.5575841683894396)}), ('g4-v55', {'coords': (289485.94348135585, 3809590.391592346, 4.488903337158263)}), ('g4-v59', {'coords': (289482.91145967756, 3809597.2189652026, 4.63420370221138)}), ('g4-v63', {'coords': (289484.813967597, 3809594.419773257, 4.238473925739527)}), ('g4-v66', {'coords': (289483.40592294093, 3809593.0233782884, 3.927971590310335)})])

edges = [('g0-v0', 'g0-v1'), ('g0-v0', 'g0-v5'), ('g0-v0', 'g0-v7'), ('g0-v1', 'g0-v2'), ('g0-v2', 'g0-v3'), ('g0-v2', 'g0-v5'), ('g0-v3', 'g0-v4'), ('g0-v4', 'g0-v5'), ('g0-v4', 'g0-v9'), ('g0-v7', 'g0-v9'), ('g1-v0', 'g1-v1'), ('g1-v0', 'g1-v15'), ('g1-v1', 'g1-v2'), ('g1-v1', 'g1-v25'), ('g1-v2', 'g1-v3'), ('g1-v2', 'g1-v25'), ('g1-v3', 'g1-v4'), ('g1-v4', 'g1-v5'), ('g1-v5', 'g1-v6'), ('g1-v5', 'g1-v36'), ('g1-v6', 'g1-v22'), ('g1-v7', 'g1-v8'), ('g1-v7', 'g1-v22'), ('g1-v8', 'g1-v9'), ('g1-v9', 'g1-v10'), ('g1-v10', 'g1-v11'), ('g1-v11', 'g1-v19'), ('g1-v11', 'g1-v36'), ('g1-v11', 'g1-v28'), ('g1-v12', 'g1-v13'), ('g1-v12', 'g1-v18'), ('g1-v12', 'g1-v25'), ('g1-v12', 'g1-v28'), ('g1-v13', 'g1-v14'), ('g1-v14', 'g1-v15'), ('g1-v18', 'g1-v19'), ('g1-v22', 'g1-v36'), ('g1-v28', 'g1-v29'), ('g1-v28', 'g1-v34'), ('g1-v28', 'g1-v32'), ('g1-v29', 'g1-v30'), ('g1-v29', 'g1-v35'), ('g1-v30', 'g1-v31'), ('g1-v31', 'g1-v32'), ('g1-v32', 'g1-v33'), ('g1-v33', 'g1-v34'), ('g1-v34', 'g1-v35'), ('g2-v0', 'g2-v1'), ('g2-v0', 'g2-v48'), ('g2-v1', 'g2-v2'), ('g2-v1', 'g2-v4'), ('g2-v2', 'g2-v3'), ('g2-v3', 'g2-v4'), ('g2-v4', 'g2-v6'), ('g2-v6', 'g2-v7'), ('g2-v7', 'g2-v8'), ('g2-v8', 'g2-v9'), ('g2-v9', 'g2-v12'), ('g2-v9', 'g2-v45'), ('g2-v12', 'g2-v14'), ('g2-v14', 'g2-v17'), ('g2-v17', 'g2-v18'), ('g2-v18', 'g2-v19'), ('g2-v19', 'g2-v20'), ('g2-v19', 'g2-v42'), ('g2-v20', 'g2-v21'), ('g2-v20', 'g2-v54'), ('g2-v21', 'g2-v22'), ('g2-v21', 'g2-v54'), ('g2-v22', 'g2-v23'), ('g2-v22', 'g2-v54'), ('g2-v23', 'g2-v24'), ('g2-v23', 'g2-v54'), ('g2-v24', 'g2-v25'), ('g2-v24', 'g2-v42'), ('g2-v25', 'g2-v26'), ('g2-v26', 'g2-v27'), ('g2-v26', 'g2-v37'), ('g2-v27', 'g2-v28'), ('g2-v28', 'g2-v29'), ('g2-v29', 'g2-v30'), ('g2-v30', 'g2-v31'), ('g2-v30', 'g2-v36'), ('g2-v31', 'g2-v35'), ('g2-v35', 'g2-v36'), ('g2-v35', 'g2-v64'), ('g2-v35', 'g2-v39'), ('g2-v36', 'g2-v37'), ('g2-v37', 'g2-v40'), ('g2-v39', 'g2-v64'), ('g2-v40', 'g2-v42'), ('g2-v40', 'g2-v64'), ('g2-v42', 'g2-v43'), 
('g2-v42', 'g2-v54'), ('g2-v43', 'g2-v44'), ('g2-v44', 'g2-v45'), ('g2-v45', 'g2-v46'), ('g2-v46', 'g2-v47'), ('g2-v47', 'g2-v48'), ('g3-v0', 'g3-v1'), ('g3-v0', 'g3-v5'), ('g3-v1', 'g3-v2'), ('g3-v2', 'g3-v3'), ('g3-v3', 'g3-v4'), ('g3-v4', 'g3-v5'), ('g4-v0', 'g4-v1'), ('g4-v0', 'g4-v53'), ('g4-v1', 'g4-v2'), ('g4-v2', 'g4-v3'), ('g4-v3', 'g4-v4'), ('g4-v3', 'g4-v11'), ('g4-v4', 'g4-v5'), ('g4-v5', 'g4-v6'), ('g4-v6', 'g4-v8'), ('g4-v6', 'g4-v10'), ('g4-v8', 'g4-v9'), ('g4-v9', 'g4-v10'), ('g4-v10', 'g4-v11'), ('g4-v11', 'g4-v12'), ('g4-v12', 'g4-v13'), ('g4-v13', 'g4-v14'), ('g4-v14', 'g4-v15'), ('g4-v15', 'g4-v17'), ('g4-v15', 'g4-v55'), ('g4-v17', 'g4-v18'), ('g4-v18', 'g4-v19'), ('g4-v19', 'g4-v20'), ('g4-v19', 'g4-v63'), ('g4-v20', 'g4-v21'), ('g4-v21', 'g4-v22'), ('g4-v21', 'g4-v63'), ('g4-v22', 'g4-v23'), ('g4-v23', 'g4-v24'), ('g4-v23', 'g4-v63'), ('g4-v24', 'g4-v26'), ('g4-v26', 'g4-v27'), ('g4-v27', 'g4-v28'), ('g4-v27', 'g4-v59'), ('g4-v28', 'g4-v29'), ('g4-v29', 'g4-v30'), ('g4-v30', 'g4-v31'), ('g4-v31', 'g4-v32'), ('g4-v32', 'g4-v33'), ('g4-v32', 'g4-v59'), ('g4-v33', 'g4-v34'), ('g4-v34', 'g4-v35'), ('g4-v35', 'g4-v37'), ('g4-v37', 'g4-v38'), ('g4-v37', 'g4-v66'), ('g4-v38', 'g4-v39'), ('g4-v39', 'g4-v40'), ('g4-v40', 'g4-v42'), ('g4-v40', 'g4-v66'), ('g4-v42', 'g4-v43'), ('g4-v43', 'g4-v44'), ('g4-v43', 'g4-v66'), ('g4-v44', 'g4-v46'), ('g4-v46', 'g4-v47'), ('g4-v47', 'g4-v48'), ('g4-v48', 'g4-v49'), ('g4-v49', 'g4-v50'), ('g4-v50', 'g4-v51'), ('g4-v51', 'g4-v52'), ('g4-v52', 'g4-v53'), ('g4-v53', 'g4-v55'), ('g4-v55', 'g4-v59')]

JamesBremner commented 1 year ago

@remidelattre Please add your description of this format here. I have a number of questions for you.

  1. Why do the vertices have three dimensions? The image you posted to SO shows 2D.

remidelattre commented 1 year ago

In my dataset, the first variable gives you the id of each polygon. de gives you the amount that they have. The rest is irrelevant to this task. I don't really know why the vertices have three dimensions: there should only be an x and a y coordinate. Opening the shape with QGIS shows vector polygons. No 3D.

I have a question for you: in the example picture you provided, it seems like the groups are interconnected, or perhaps I am misunderstanding it. My idea was that when a group reaches the -0.5 to 0.5 range, and if we have more than 5 entities, we create another group. What we have noticed is that most entities have a negative value in de, and it's impossible to have every entity in a group because of that. Also, when we picked a random entity to start the group-making process, the results varied a lot between runs, because the output depends on where the algorithm starts. Hence, we've decided to always start with the locality that has the highest value for "de". This locality then tries to help its neighbors reach 0, favouring neighbors that have reasonably low deficits, to maximize the number of localities in a group. Thoughts on that method?
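A minimal Python sketch of the seeding rule described above, for discussion only. The function name `grow_group`, the toy `de` values and the adjacency map are hypothetical. The -0.5 to 0.5 band, starting from the highest "de", and preferring the smallest deficits come from the description; the stopping rule and the tie-break are assumptions.

```python
def grow_group(de, adjacency, unassigned, low=-0.5, high=0.5):
    """Grow one group from the unassigned locality with the highest de.

    de:         dict locality id -> value (positive = surplus, negative = deficit)
    adjacency:  dict locality id -> set of neighbouring ids
    unassigned: set of ids not yet placed in a group
    Returns (members, total). Assumption: growth stops once the total falls
    inside [low, high] or no neighbouring deficit can be absorbed.
    """
    seed = max(unassigned, key=lambda v: de[v])
    members = [seed]
    total = de[seed]
    while total > high:
        # Unassigned localities touching the current group.
        frontier = {n for m in members for n in adjacency[m]
                    if n in unassigned and n not in members}
        # Deficits the group can absorb without dropping below the band.
        candidates = [n for n in frontier
                      if de[n] < 0 and total + de[n] >= low]
        if not candidates:
            break
        # Favour the smallest deficit (de closest to zero); id breaks ties.
        best = max(candidates, key=lambda n: (de[n], n))
        members.append(best)
        total += de[best]
    return members, total


# Toy data, not taken from the real dataset.
toy_de = {"A": 3.0, "B": -1.0, "C": -1.5, "D": -4.0, "E": 1.0}
toy_adjacency = {"A": {"B", "C", "D"}, "B": {"A", "E"},
                 "C": {"A"}, "D": {"A"}, "E": {"B"}}
group, group_total = grow_group(toy_de, toy_adjacency, set(toy_de))
```

In the toy data the group starts at "A", absorbs "B" and then "C", and stops at a total of 0.5; "D" is never affordable. Always seeding from the highest "de" makes the result repeatable, which addresses the run-to-run variation mentioned above.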

remidelattre commented 1 year ago

Also, here is an example of how this would work in practice: https://ibb.co/RQNyW38

remidelattre commented 1 year ago

By first listing all localities within 4 degrees of adjacency, then trying to maximize "de" along a contiguous path, then handing out the surpluses to the localities with the lowest deficits, we probably maximize group size: https://ibb.co/Kq1B4HY (this is me trying to form a group manually; "de" is probably not in fact between -0.5 and 0.5 here). There are also mistakes in my assessment of the adjacency degrees, hence the need for an algorithm 🗡️
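The first step, listing every locality within 4 degrees of adjacency, is a breadth-first search with a depth limit. A short Python sketch, assuming the adjacency is already available as a dict of neighbour lists (the name `within_degrees` and the toy chain are hypothetical):

```python
from collections import deque


def within_degrees(adjacency, start, max_degree=4):
    """Return {locality: degree} for everything reachable in <= max_degree hops."""
    degree = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if degree[current] == max_degree:
            continue  # do not expand beyond the limit
        for neighbour in adjacency[current]:
            if neighbour not in degree:
                degree[neighbour] = degree[current] + 1
                queue.append(neighbour)
    return degree


# Toy chain a-b-c-d-e-f: f is 5 hops from a, so it is excluded.
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"],
         "d": ["c", "e"], "e": ["d", "f"], "f": ["e"]}
reachable = within_degrees(chain, "a")
```

Because breadth-first search visits localities in order of distance, each recorded degree is the smallest possible one, which avoids the manual mistakes in counting adjacency degrees.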

JamesBremner commented 1 year ago

In my dataset, de gives you the amount that they have.

Looking at your dataset ( I have posted it in the first post to this issue, above ) there is no 'de' anywhere that I can see.

JamesBremner commented 1 year ago

I don't really know why the vertices have three dimensions: there should only be an x and a y coordinate.

Well, if you do not know, how can you expect me to understand and use your dataset?

JamesBremner commented 1 year ago

I would really like to demonstrate my method of assigning localities to groups using one of your datasets.

Please post a dataset that gives me the locality x and y coordinates, the locality values and the locality indices.

remidelattre commented 1 year ago

Using the dataset I provided on Stack Overflow, the unique id is in the "insee" column (the first column of the .dbf file). The shapefile stores the x and y, but they are not explicit. I'll try to create dbf columns with the x and y values, but you don't need them if you read the shapefiles (I'm not sure you can do that with C++; it does work with Python, though). By locality indices I assume you mean "de", which is another column in the .dbf.

JamesBremner commented 1 year ago

Thank you for explaining this.

Something must have gone wrong when I downloaded your file yesterday - there was no .dbf file.

Just now I tried the download again and this time got better results.

image

remidelattre commented 1 year ago

Alright, sorry for the inconvenience. I just realized something: my data comes in the form of multipolygons. If I convert them to points and add x and y coords, you lose the knowledge of which ones are actually adjacent, because one set of x/y coordinates does not tell you the full geography of a polygon. I'm trying to create polylines between my points so you can have the coords while also having the polylines.
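One possible way around this is to derive the adjacency once from the polygon outlines and then work only with ids and an adjacency list. The Python sketch below treats two localities as adjacent when their outlines share at least two vertices. This rests on an assumption that has not been checked against the real shapefile: neighbouring polygons must store identical coordinates along their common border. The function name and the toy squares are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def adjacency_from_polygons(polygons, min_shared=2):
    """polygons: dict id -> list of (x, y). Returns dict id -> set of adjacent ids."""
    # Which polygons use each vertex?
    owners = defaultdict(set)
    for pid, ring in polygons.items():
        for point in set(ring):
            owners[point].add(pid)
    # Count the vertices shared by every pair of polygons.
    shared = defaultdict(int)
    for ids in owners.values():
        for a, b in combinations(sorted(ids), 2):
            shared[(a, b)] += 1
    adjacency = {pid: set() for pid in polygons}
    for (a, b), count in shared.items():
        if count >= min_shared:  # one shared vertex is only a corner touch
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


# Toy squares: P and Q share an edge, R only touches Q at one corner.
squares = {
    "P": [(0, 0), (0, 1), (1, 1), (1, 0)],
    "Q": [(1, 0), (1, 1), (2, 1), (2, 0)],
    "R": [(2, 1), (2, 2), (3, 2), (3, 1)],
}
neighbours = adjacency_from_polygons(squares)
```

If the borders were digitised independently, exact matching will miss neighbours and a tolerance or a GIS intersection test would be needed instead.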

JamesBremner commented 1 year ago

It is not helpful to share data using binary file formats.

I have tried converting your dbf file to csv using https://www.dbf2002.com/dbf-converter/

However, that will only allow me to convert 50 records.

Please can you convert your data to a CSV text file and post that.

JamesBremner commented 1 year ago

Polygons are usually specified by listing the coords of their vertices in a defined order (clockwise or anti-clockwise).

So a unit square might be "0,0,0,1,1,1,1,0"

Please post your locality polygons like this.
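To make the requested format concrete, here is a small Python sketch that reads such a flat list into vertices and uses the shoelace formula to report the winding order. The function names are illustrative only.

```python
def parse_flat_polygon(text):
    """Turn "x0,y0,x1,y1,..." into a list of (x, y) tuples."""
    values = [float(v) for v in text.split(",")]
    if len(values) % 2:
        raise ValueError("expected an even number of coordinates")
    return list(zip(values[0::2], values[1::2]))


def signed_area(vertices):
    """Shoelace formula: positive for anti-clockwise order, negative for clockwise."""
    total = 0.0
    # Pair each vertex with the next one, wrapping around to the first.
    for (x1, y1), (x2, y2) in zip(vertices, vertices[1:] + vertices[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2.0


square = parse_flat_polygon("0,0,0,1,1,1,1,0")
area = signed_area(square)
```

For the unit square above the signed area is -1.0, so that listing is clockwise (with y pointing up).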

remidelattre commented 1 year ago

Here's a CSV of the same dataset. You can convert a .dbf to a .csv file using Excel or Calc. Encoding in UTF-8 is the best solution, though as you'll see, some locality names are wrongly processed (this can be fixed using QGIS). Your solution for listing the coords of the vertices is interesting, except my multipolygons have 4100 summits on average, making it highly impractical. I will update the shapefile and the corresponding CSV with coords once I find a solution.

CSV: https://file.io/mwdaTIDPcyZG

To recap:

insee: the id of every locality (their geometry is in the shapefile)

de: the value that we're trying to balance across groups

JamesBremner commented 1 year ago

Here we are:

insee,nom,wikipedia,surf_ha,de,INSEE_REG
2B222,Pie-d'Orezza,fr:Pie-d'Orezza,573.000000000000000,1.000000000000000,94
2B137,Lano,fr:Lano,824.000000000000000,1.000000000000000,94
2B051,Cambia,fr:Cambia,833.000000000000000,1.000000000000000,94
2B106,Érone,fr:Érone,393.000000000000000,1.000000000000000,94
2B185,Oletta,fr:Oletta,2674.000000000000000,-28.144484624975900,94
2B058,Canari,fr:Canari (Haute-Corse),1678.000000000000000,-0.594036587198047,94
2B188,Olmeta-di-Tuda,fr:Olmeta-di-Tuda,1753.000000000000000,-6.432021336520920,94
2B052,Campana,fr:Campana,236.000000000000000,1.000000000000000,94
2B063,Carcheto-Brustico,fr:Carcheto-Brustico,525.000000000000000,1.000000000000000,94
2B015,Ampriani,fr:Ampriani,230.000000000000000,1.027900000000000,94
2B213,Pianello,fr:Pianello,1677.000000000000000,1.000000000000000,94
2B364,Zuani,fr:Zuani,518.000000000000000,1.000000000000000,94
2B226,Pietraserena,fr:Pietraserena,678.000000000000000,1.090677029836480,94
2B221,Piedipartino,fr:Piedipartino,326.000000000000000,1.000000000000000,94
66113,Montbolo,fr:Montbolo,2230.000000000000000,2.136850000000000,76
66202,Targasonne,fr:Targasonne,787.000000000000000,1.130500000000000,76
66001,L'Albère,fr:L'Albère,1620.000000000000000,0.995370629370629,76
66117,Mont-Louis,fr:Mont-Louis (Pyrénées-Orientales),38.000000000000000,1.000000000000000,76
66072,Estavar,fr:Estavar,926.000000000000000,-19.660138808571400,76
66064,Égat,fr:Égat,451.000000000000000,1.008514114832530,76
66063,Les Cluses,fr:Les Cluses,879.000000000000000,1.556700000000000,76
66027,La Cabanasse,fr:La Cabanasse,317.000000000000000,2.040817051732790,76
66123,Nyer,fr:Nyer,3640.000000000000000,-2.107813205098700,76
66066,Enveitg,fr:Enveitg,3039.000000000000000,-3.095284413743880,76
2B229,Pietroso,fr:Pietroso,2586.000000000000000,0.114870862732372,94
2A269,Sari-Solenzara,fr:Sari-Solenzara,7448.000000000000000,-14.631546768086300,94
2A092,Conca,fr:Conca,7846.000000000000000,-11.859470906057300,94
2B121,Galéria,fr:Galéria,13592.000000000000000,0.973574701548631,94
2B095,Corscia,fr:Corscia,5876.000000000000000,1.295900000000000,94
2B153,Manso,fr:Manso,12103.000000000000000,0.262231356946367,94
66003,Amélie-les-Bains-Palalda,fr:Amélie-les-Bains-Palalda,2932.000000000000000,1.327592052849530,76
66106,Maureillas-las-Illas,fr:Maureillas-las-Illas,4235.000000000000000,2.232761507059700,76
66137,Le Perthus,fr:Le Perthus,423.000000000000000,0.980721794871795,76
66194,Serralongue,fr:Serralongue,2267.000000000000000,1.059850000000000,76
66179,Saint-Laurent-de-Cerdans,fr:Saint-Laurent-de-Cerdans,4482.000000000000000,1.118818093385210,76
66009,Arles-sur-Tech,fr:Arles-sur-Tech,2848.000000000000000,3.018029431474090,76
66060,Corsavy,fr:Corsavy,4765.000000000000000,1.000000000000000,76
66116,Montferrer,fr:Montferrer,2175.000000000000000,1.095200000000000,76
66206,Le Tech,fr:Le Tech,2613.000000000000000,1.000000000000000,76
66150,Prats-de-Mollo-la-Preste,fr:Prats-de-Mollo-la-Preste,11964.000000000000000,-4.967635443641400,76
66102,Mantet,fr:Mantet,3254.000000000000000,0.177247912482831,76
66155,Py,fr:Py (Pyrénées-Orientales),4997.000000000000000,1.025000000000000,76
66080,Fontpédrouse,fr:Fontpédrouse,6427.000000000000000,1.105651145311380,76
66142,Planès,fr:Planès,1441.000000000000000,0.795061976676205,76
66188,Saint-Pierre-dels-Forcats,fr:Saint-Pierre-dels-Forcats,1298.000000000000000,-0.543096279523108,76
66181,Sainte-Léocadie,fr:Sainte-Léocadie,884.000000000000000,-6.772902397415980,76
66025,Bourg-Madame,fr:Bourg-Madame,770.000000000000000,-6.007165231944400,76
66100,Llo,fr:Llo,2864.000000000000000,0.746754468180079,76
66075,Eyne,fr:Eyne,2044.000000000000000,0.976217181739371,76
66160,Reynès,fr:Reynès,2761.000000000000000,2.980374422060030,76
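A minimal Python sketch of reading this CSV into a map from locality id to "de" value. Only the column names come from the header above; the function name is hypothetical and the two sample rows are copied from the listing.

```python
import csv
import io

SAMPLE = """insee,nom,wikipedia,surf_ha,de,INSEE_REG
66113,Montbolo,fr:Montbolo,2230.000000000000000,2.136850000000000,76
2B185,Oletta,fr:Oletta,2674.000000000000000,-28.144484624975900,94
"""


def read_localities(stream):
    """Return dict insee -> de.

    insee is kept as a string because Corsican codes such as 2B185
    are not numeric.
    """
    return {row["insee"]: float(row["de"]) for row in csv.DictReader(stream)}


values = read_localities(io.StringIO(SAMPLE))
```

For the real file, pass `open(path, newline="", encoding="utf-8")` instead of the in-memory sample.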
JamesBremner commented 1 year ago

So, now, all I need is the corresponding polygons

my multipolygons have 4100 summits on average,

What is a "summit"? Is it the same as a vertex?

JamesBremner commented 1 year ago

my data comes in the form of multipolygons.

I do not know what a "multipolygon" is.

JamesBremner commented 1 year ago

Perhaps you could post one locality?

I can take a look at it and come up with some preprocessing that removes redundant information.

We only need to do preprocessing once ( I doubt the localities change shape very often ) so it doesn't matter much how long it takes.

remidelattre commented 1 year ago

Alright, I have the geometry figured out in ASCII representation. This is the closest thing to a polygon geometry. I have tried to convert the polygons to points: the resulting file was 18,600,000 entries long. A layer is a set of points, so there's no single x,y,z coord unless you take the center. You could do that, but then you lose the notion of adjacency. The data with WKT is quite big, so I'll just send you a limited number of entities.

Here is an interesting read on the different ESRI shapefile types: https://gis.stackexchange.com/questions/225368/understanding-difference-between-polygon-and-multipolygon-for-shapefiles-in-qgis

This is how I got the WKT coords: https://gis.stackexchange.com/questions/8844/getting-list-of-coordinates-for-points-in-layer-using-qgis/8911#8911

This is the link to the file: https://file.io/AsA8mDwxb2ZE
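For readers unfamiliar with WKT, the Python sketch below pulls the coordinate rings out of a Polygon or MultiPolygon string using only the standard library. It is a deliberate simplification: it flattens the structure, so outer rings and holes are not distinguished, and the function name and toy input are hypothetical. A dedicated geometry library would be more robust.

```python
import re


def wkt_rings(wkt):
    """Return every ring in a WKT Polygon/MultiPolygon as a list of (x, y) floats."""
    rings = []
    # The innermost parenthesised groups are the rings' coordinate lists.
    for body in re.findall(r"\(([^()]+)\)", wkt):
        ring = []
        for pair in body.split(","):
            x, y = pair.split()[:2]  # ignore an optional z value
            ring.append((float(x), float(y)))
        rings.append(ring)
    return rings


# Toy input: a MultiPolygon made of two triangles.
toy_wkt = "MultiPolygon (((0 0, 0 1, 1 1, 0 0)),((2 2, 2 3, 3 3, 2 2)))"
rings = wkt_rings(toy_wkt)
```

A multipolygon is simply a locality made of several separate polygons (for example a mainland part plus an island), which is why one record can yield more than one ring.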

JamesBremner commented 1 year ago

Thank you.

I will convert this from Excel binary to CSV text and write some code to read it.

JamesBremner commented 1 year ago

Here are the results of the initial parsing of the data file

666027.14887595619075 6155106....75 6155106.7352640014141798
66113,Montbolo,fr:Montbolo,2230,2.13685,76

The numbers in the first line specify the vertices of the locality.

66113 is the locality index.

2.13685 is the value of the locality.

That seems fine for a start.

However, many of the lines in the file look like this:

1221914.16772219375707209 6160...9.41748260520398617, 1222562.7
,,,,,,

That is, the details of locality index and value are empty.

What should I make of these lines? Ignore them?
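Rather than silently ignoring such lines, a reader could set them aside for inspection. The Python sketch below assumes the converted file alternates one geometry line with one attribute line, as in the two excerpts above, and that the attribute columns follow the CSV order (insee first, de fifth). The function name and the toy geometry strings are hypothetical.

```python
def pair_records(lines):
    """Split alternating geometry/attribute lines into (complete, orphaned)."""
    complete, orphaned = [], []
    for geometry, attributes in zip(lines[0::2], lines[1::2]):
        fields = attributes.split(",")
        if len(fields) >= 5 and fields[0] and fields[4]:
            complete.append({"insee": fields[0],
                             "de": float(fields[4]),
                             "geometry": geometry})
        else:
            # Attribute columns are empty: keep the geometry for inspection.
            orphaned.append(geometry)
    return complete, orphaned


toy_lines = [
    "1.0 2.0, 3.0 4.0",  # placeholder geometry, not real data
    "66113,Montbolo,fr:Montbolo,2230,2.13685,76",
    "5.0 6.0, 7.0 8.0",  # placeholder geometry, not real data
    ",,,,,,",
]
complete, orphaned = pair_records(toy_lines)
```

Counting the orphaned geometries gives a quick measure of how much of the file was damaged during conversion.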

remidelattre commented 1 year ago

I think what's happening is that the lines are so long that they overlap and create bugs. Here is an example of a single cell:

_MultiPolygon (((1220731.03914475860074162 6163639.56559936609119177, 1220739.76265204604715109 6163644.28344513103365898, 1220755.39198782620951533 6163653.68498235382139683, 1220761.94441876676864922 6163659.67969244346022606, 1220774.77590833627618849 6163671.98204784467816353, 1220797.10135718109086156 6163694.86811114102602005, 1220812.99888525833375752 6163710.20687372982501984, 1220836.62846141401678324 6163732.30479677673429251, 1220872.70274045411497355 6163786.98816235270351171, 1220888.91903544683009386 6163810.72378411889076233, 1220889.34113059192895889 6163811.65056689642369747, 1220891.08347177132964134 6163832.21579181030392647, 1220894.94395803287625313 6163869.69311805535107851, 1220904.04230083082802594 6163922.32381283119320869, 1220908.84586746548302472 6163948.04569607134908438, 1220922.19430503714829683 6164002.35681462939828634, 1220927.44054448162205517 6164021.52899059467017651, 1220932.76559608406387269 6164030.43898897152394056, 1220940.2700562602840364 6164042.09101320896297693, 1220956.48745900625362992 6164066.83144382666796446, 1220965.20039287814870477 6164079.919852907769382, 1220969.80792649881914258 6164086.42840434238314629, 1220973.63784947083331645 6164092.31648165080696344, 1220997.705375284422189 6164132.64334410894662142, 1221002.58826722297817469 6164139.84370685089379549, 1221036.44879575143568218 6164203.50278901867568493, 1221055.42587067186832428 6164238.17543965484946966, 1221094.26373984618112445 6164306.81056968122720718, 1221107.18733397801406682 6164329.27795722614973783, 1221146.81667236378416419 6164401.43701763451099396, 1221174.23933748924173415 6164453.53024357371032238, 1221189.56755841872654855 6164482.10643344838172197, 1221197.0195834063924849 6164495.42879414651542902, 1221206.30332987289875746 6164437.50980930402874947, 1221221.46326803369447589 6164377.91963471099734306, 1221226.35804975358769298 6164368.89242135733366013, 1221266.2559926297981292 6164300.45093995984643698, 1221305.26737088593654335 
6164232.65272753871977329, 1221351.45749624003656209 6164154.80546964984387159, 1221366.76099071837961674 6164103.02872530557215214, 1221373.38898291019722819 6164085.20042574312537909, 1221383.7005041460506618 6164061.17199473828077316, 1221405.59056217037141323 6164011.35294552147388458, 1221431.63263213215395808 6163949.29957478865981102, 1221471.79715194180607796 6163812.59396978188306093, 1221504.57252533128485084 6163700.00636865384876728, 1221563.18794950330629945 6163653.88562374375760555, 1221617.64678089297376573 6163611.51647269353270531, 1221695.47307543316856027 6163549.49477237649261951, 1221801.66343848733231425 6163466.41462834924459457, 1221841.53837576531805098 6163434.6280377758666873, 1221898.40236468845978379 6163390.07673998642712831, 1221904.02932778722606599 6163394.91560098994523287, 1221919.53404012997634709 6163410.08057297579944134, 1221971.66158801084384322 6163462.63473416958004236, 1222000.25261072674766183 6163491.1298656053841114, 1222026.84779369598254561 6163518.3038423964753747, 1222047.08049137657508254 6163538.15771725960075855, 1222065.40269891358911991 6163541.32765458151698112, 1222072.17076010536402464 6163541.74913960229605436, 1222076.30371858947910368 6163541.70200516097247601, 1222081.09285281482152641 6163540.90400428604334593, 1222090.91958873486146331 6163538.37925554253160954, 1222100.71882672375068069 6163534.85892386920750141, 1222105.31123340595513582 6163534.34648313373327255, 1222109.11839468427933753 6163534.44059234671294689, 1222118.21655331319198012 6163535.83083209488540888, 1222121.11645417008548975 6163535.7850131718441844, 1222127.6830825605429709 6163535.00722159817814827, 1222135.0489240875467658 6163533.75795557536184788, 1222144.89711274160072207 6163533.33339929115027189, 1222150.75993804121389985 6163532.76690443605184555, 1222153.26868768851272762 6163532.13155760895460844, 1222157.68731957860291004 6163530.07603126764297485, 1222166.77675042627379298 6163524.57890140824019909, 
1222169.40696959290653467 6163523.46221781242638826, 1222173.02387602441012859 6163522.93833232019096613, 1222181.3620401811785996 6163522.87232580408453941, 1222186.3841678025200963 6163522.57309107482433319, 1222193.23354228027164936 6163521.57254963368177414, 1222199.73341480130329728 6163521.21360162552446127, 1222201.68503075605258346 6163521.5380074568092823, 1222203.60651070484891534 6163522.33994087018072605, 1222216.60115142585709691 6163530.84118714183568954, 1222220.72262926283292472 6163532.89161071926355362, 1222235.96303470502607524 6163537.95717141777276993, 1222250.75713463011197746 6163541.78140769060701132, 1222264.18219967954792082 6163544.7253901120275259, 1222285.66344496258534491 6163548.22822136990725994, 1222289.65581843024119735 6163548.69451012182980776, 1222321.16909940494224429 6163548.56220758054405451, 1222327.2869421630166471 6163549.04326192568987608, 1222333.00733896205201745 6163549.93881141766905785, 1222343.7918584025464952 6163553.40718147996813059, 1222364.8782057729549706 6163561.09761787950992584, 1222374.38617295422591269 6163562.84488434251397848, 1222380.49361476302146912 6163563.14656621962785721, 1222383.54558576690033078 6163562.85638237558305264, 1222387.10042377607896924 6163561.86999288480728865, 1222391.12894339067861438 6163559.72742129396647215, 1222396.49691211106255651 6163556.36439375579357147, 1222422.91480307001620531 6163543.86841483134776354, 1222457.48939618561416864 6163528.0107396524399519, 1222473.97987110097892582 6163520.83264089375734329, 1222484.8118378845974803 6163516.51420341152697802, 1222490.94072532560676336 6163514.59653408546000719, 1222496.31510495604015887 6163513.62269852124154568, 1222505.23931001173332334 6163512.85643586050719023, 1222526.23967741336673498 6163513.18492048047482967, 1222550.52316098310984671 6163515.18410532176494598, 1222552.63112335535697639 6163515.10821861494332552, 1222562.87133586336858571 6163512.28258978296071291, 1222572.30563581013120711 
6163511.24494870472699404, 1222583.42962442734278738 6163511.62692915927618742, 1222585.83371465234085917 6163511.16190670523792505, 1222593.99835460842587054 6163508.62688074167817831, 1222595.45539126126095653 6163508.72184437606483698, 1222596.9396282930392772 6163509.19849605206400156, 1222615.61654827347956598 6163516.49489424470812082, 1222626.11480901227332652 6163519.30447754915803671, 1222634.01497142459265888 6163519.43818367831408978, 1222641.1751130660995841 6163520.95218587294220924, 1222647.39087150478735566 6163522.99290098063647747, 1222649.03254117979668081 6163523.1585494065657258, 1222651.21234162664040923 6163522.90989050082862377, 1222671.48367698560468853 6163518.72662735730409622, 1222680.21152047300711274 6163515.98033052496612072, 1222685.58221826213411987 6163512.99724169727414846, 1222689.38989439862780273 6163509.89947698637843132, 1222694.39940347848460078 6163504.72193995025008917, 1222698.31214897753670812 6163501.14152287412434816, 1222700.61897078435868025 6163499.9320362638682127, 1222703.55354877817444503 6163499.35350307915359735, 1222707.53266382566653192 6163499.05998228024691343, 1222721.68977469578385353 6163499.28458498883992434, 1222738.28961216262541711 6163499.18130243010818958, 1222745.05828529829159379 6163499.90490034781396389, 1222759.30033414671197534 6163503.59656181093305349, 1222774.09627645323053002 6163504.520099937915802, 1222777.41414261795580387 6163505.24490172974765301, 1222788.30997023964300752 6163509.69405466131865978, 1222788.35574349761009216 6163503.7820191215723753, 1222789.26787974429316819 6163498.62061479222029448, 1222789.79377052886411548 6163496.50874735601246357, 1222790.36742016207426786 6163495.03694495093077421, 1222791.03187855705618858 6163493.77336644381284714, 1222792.11617196211591363 6163492.33152301516383886, 1222793.12702319957315922 6163491.18513261340558529, 1222802.48326202831231058 6163482.89756080787628889, 1222803.82385253277607262 6163481.25312322005629539, 
1222806.57647983822971582 6163477.07706261333078146, 1222809.97718697320669889 6163472.66298699006438255, 1222814.77347937738522887 6163468.69615182094275951, 1222819.21148865297436714 6163466.09568500891327858, 1222820.68082707328721881 6163465.42152846325188875, 1222826.81413735775277019 6163462.83481200411915779, 1222829.69825853360816836 6163460.62265415210276842, 1222837.26713855681009591 6163453.87660650536417961, 1222844.34926258912310004 6163449.16745108645409346, 1222848.04147301614284515 6163447.29942312929779291, 1222866.99047155934385955 6163441.04576922208070755, 1222870.82929128129035234 6163439.92623665649443865, 1222872.37353113270364702 6163439.2469613291323185, 1222873.75777651253156364 6163437.98555088881403208, 1222874.95252954191528261 6163435.99452660419046879, 1222876.50446122838184237 6163431.52087376639246941, 1222877.28860250627622008 6163428.25782397482544184, 1222877.77302470756694674 6163425.42826701514422894, 1222878.58283845242112875 6163420.71625597309321165, 1222878.90536702005192637 6163418.76660079136490822, 1222879.41754881525412202 6163417.13358771335333586, 1222880.70879255025647581 6163414.45830582920461893, 1222883.41468082275241613 6163410.65800420008599758, 1222888.30935743590816855 6163406.49822540208697319, 1222892.00371178425848484 6163403.78209713939577341, 1222894.64300955971702933 6163401.5279207918792963, 1222897.22615368268452585 6163398.53254610858857632, 1222897.76483765011653304 6163397.90622704569250345, 1222899.54709237231872976 6163395.50489170756191015, 1222900.0666533641051501 6163394.91051786299794912, 1222901.44482107716612518 6163393.31376907881349325, 1222902.61318075703456998 6163392.36982790473848581, 1222903.77762379334308207 6163391.78274796716868877, 1222905.64308157819323242 6163391.43071856535971165, 1222908.2229523288551718 6163391.14739294722676277, 1222911.13348613190464675 6163390.86837779264897108, 1222912.54566366504877806 6163390.39054705668240786, 1222916.18195620854385197 
6163387.88182749971747398, 1222920.59801257401704788 6163383.70584196504205465, 1222922.75651824311353266 6163381.25668460503220558, 1222923.32988013373687863 6163380.61083685327321291, 1222924.03818310098722577 6163379.83075509965419769, 1222925.78433626005426049 6163377.15863497275859118, 1222927.41635800013318658 6163376.51992636267095804, 1222929.16062462865374982 6163375.82328846957534552, 1222939.91560433874838054 6163371.54399973340332508, 1222945.64776265015825629 6163368.28884790185838938, 1222948.65316094527952373 6163366.42135583236813545, 1222951.85162195912562311 6163365.23912275023758411, 1222975.59926060098223388 6163361.57115415204316378, 1222995.56251882575452328 6163357.8328080028295517, 1222996.67104358272626996 6163357.63190100435167551, 1223006.48582963040098548 6163355.67675942089408636, 1223010.80391775374300778 6163354.36151360347867012, 1223016.33257625019177794 6163352.19504206907004118, 1223021.96830544411204755 6163350.13765664771199226, 1223036.31785664986819029 6163347.46519047953188419, 1223038.65880972123704851 6163348.40162503812462091, 1223040.21026116143912077 6163348.66055686771869659, 1223041.44891594862565398 6163348.6933767506852746, 1223043.56444773403927684 6163348.21642853412777185, 1223046.03101353743113577 6163347.38826094940304756, 1223049.38106266316026449 6163345.66019613109529018, 1223052.54502871190197766 6163343.05768297240138054, 1223055.20960554899647832 6163340.59352603740990162, 1223068.64451178861781955 6163333.66176531743258238, 1223072.72632119152694941 6163331.47922486811876297, 1223077.27212031558156013 6163328.26253735646605492, 1223095.47408593026921153 6163312.59571434836834669, 1223103.11440400127321482 6163308.2554716756567359, 1223123.94294632249511778 6163297.89013012498617172, 1223135.41814730502665043 6163292.17348844651132822, 1223148.42321119061671197 6163287.39495260082185268, 1223150.61777356988750398 6163286.65652118250727654, 1223153.22683724574744701 6163286.31983086653053761, 
1223183.23834387632086873 6163286.26031206920742989, 1223186.55271855602040887 6163286.00279596913605928, 1223189.11584902950562537 6163285.51731462031602859, 1223195.73691319162026048 6163283.24926504027098417, 1223204.71342061180621386 6163279.78697629831731319, 1223212.07305200304836035 6163277.28825458884239197, 1223221.03616426233202219 6163276.55955386720597744, 1223225.30662934691645205 6163276.14471048396080732, 1223230.69726466480642557 6163274.87139146309345961, 1223249.40571010438725352 6163268.43194901756942272, 1223274.87254883162677288 6163263.46370281465351582, 1223282.36661844816990197 6163263.09663995262235403, 1223302.3314341816585511 6163263.04271844681352377, 1223305.03060016990639269 6163263.02588111534714699, 1223306.29572781664319336 6163263.03856006171554327, 1223313.08448382397182286 6163263.10580235347151756, 1223342.97633634135127068 6163266.07338404376059771, 1223361.58621915290132165 6163269.48221452720463276, 1223365.44683575187809765 6163269.7376189986243844, 1223377.98197009647265077 6163268.22554522100836039, 1223392.4387596242595464 6163266.589366240426898, 1223396.36143285362049937 6163265.7670923164114356, 1223405.3361109895631671 6163262.74023214913904667, 1223411.37192542850971222 6163260.03454523347318172, 1223425.45608469471335411 6163252.65352580230683088, 1223434.63982084766030312 6163247.444671630859375, 1223449.13609050982631743 6163238.54542493913322687, 1223449.83121753996238112 6163238.54566045571118593, 1223453.18548451433889568 6163238.61518675740808249, 1223454.30392277287319303 6163237.26548184733837843, 1223455.26505313185043633 6163236.32724466826766729, 1223456.11517568561248481 6163235.63678026385605335, 1223457.16531132441014051 6163234.92895670421421528, 1223458.47122660977765918 6163234.23059407249093056, 1223460.01021848362870514 6163233.61799506563693285, 1223462.27365026716142893 6163233.05264405068010092, 1223461.72164336778223515 6163229.7377115786075592, 1223471.19842614512890577 
6163223.87165628746151924, 1223480.4926836034283042 6163219.14056826010346413, 1223500.10455880546942353 6163209.30330740939825773, 1223501.33606850844807923 6163209.93837728537619114, 1223501.68837753427214921 6163209.77703320328146219, 1223517.27688306197524071 6163202.21611990500241518, 1223520.61820107605308294 6163200.5991921853274107, 1223520.74779293104074895 6163200.53150880895555019, 1223521.67781397188082337 6163200.08189377002418041, 1223523.68527972465381026 6163196.2254930641502142, 1223535.78213020926341414 6163195.22529618721455336, 1223540.10931910458020866 6163195.33983425702899694, 1223545.57809434621594846 6163196.38357297331094742, 1223548.95777293061837554 6163196.34357707016170025, 1223552.46118388907052577 6163195.48758171405643225, 1223557.79957827995531261 6163194.55632597394287586, 1223563.83111801627092063 6163194.06047523207962513, 1223591.45514190755784512 6163191.70066580269485712, 1223595.93232533987611532 6163191.39202370587736368, 1223602.91932067042216659 6163189.82358910702168941, 1223607.35955264233052731 6163188.74180434830486774, 1223610.84119779337197542 6163187.43760829139500856, 1223614.99657527217641473 6163185.37293261289596558, 1223628.40685026859864593 6163176.29716552793979645, 1223651.29410731303505599 6163162.89596105460077524, 1223661.40338348993100226 6163157.09233130887150764, 1223666.99830132303759456 6163156.57251730654388666, 1223672.88836058485321701 6163155.57422820571810007, 1223700.18016183795407414 6163145.74307289067655802, 1223724.60701975831761956 6163137.26590960938483477, 1223756.15512943617068231 6163129.25168818514794111, 1223764.83342235162854195 6163126.82644543051719666, 1223772.29261476057581604 6163125.15117002557963133, 1223777.95075591187924147 6163123.64313743263483047, 1223780.67551700514741242 6163121.97659226879477501, 1223784.59566090325824916 6163119.03356984071433544, 1223798.7359649296849966 6163122.25047779269516468, 1223800.64173701964318752 6163122.42655510641634464, 
1223833.90753417462110519 6163121.02519134618341923, 1223848.12493306770920753 6163081.64333830773830414, 1223857.02393846004270017 6163056.17545796371996403, 1223822.71562600648030639 6162981.4575215969234705, 1223822.88903105678036809 6162979.20563829783350229, 1223830.59299285616725683 6162923.9281876627355814, 1223838.41937141353264451 6162906.42146147042512894, 1223845.70252997661009431 6162891.23724390380084515, 1223850.88292274996638298 6162881.49783461540937424, 1223853.58021017210558057 6162876.48050380777567625, 1223857.66200831672176719 6162870.7154102623462677, 1223860.47074847994372249 6162866.77862107660621405, 1223862.68371749948710203 6162862.73795352317392826, 1223864.64289988693781197 6162858.25265535898506641, 1223867.20849127881228924 6162850.96998633816838264, 1223874.87938148807734251 6162834.87946263514459133, 1223876.49995276355184615 6162831.10353386029601097, 1223878.25496099190786481 6162825.25116570014506578, 1223879.8384079032111913 6162818.9608002407476306, 1223904.74608968012034893 6162803.31232790742069483, 1223908.33843667339533567 6162794.89586038049310446, 1223910.49438123730942607 6162789.51116435509175062, 1223914.05036952160298824 6162780.10951050464063883, 1223920.29681828594766557 6162767.00710268225520849, 1223924.90551636531017721 6162753.99578235484659672, 1223927.33443514979444444 6162746.86952255293726921, 1223937.55830831779167056 6162726.11841986700892448, 1223936.0582894841209054 6162725.01512323878705502, 1223937.13737729913555086 6162720.25790007039904594, 1223937.83595692506060004 6162718.47254461422562599, 1223945.77766947750933468 6162705.18324545212090015, 1223953.80322227510623634 6162693.93219846580177546, 1223956.10887054423801601 6162690.07761739287525415, 1223961.9453722529578954 6162682.46732725668698549, 1223971.33681618049740791 6162664.75210686307400465, 1223972.68523354781791568 6162662.60620034113526344, 1223977.64938408602029085 6162657.19140550121665001, 1223981.43130474607460201 
6162652.78624475002288818, 1223985.18737549381330609 6162646.03498089127242565, 1223989.40791762038134038 6162642.24564400035887957, 1223996.06549513200297952 6162635.02533385530114174, 1224004.46902078483253717 6162627.91244613192975521, 1224005.34824319323524833 6162623.78649089299142361, 1224005.92982826940715313 6162623.04092483595013618, 1224022.10126626631245017 6162605.61619535740464926, 1224023.03683099965564907 6162604.58666247967630625, 1224032.20784000540152192 6162595.86144231539219618, 1224040.53884341148659587 6162589.03295008279383183, 1224045.88537545129656792 6162586.06004919949918985, 1224048.59766437136568129 6162583.32102377526462078, 1224049.23469836707226932 6162580.76052840612828732, 1224052.06597501575015485 6162577.06001259293407202, 1224052.98449034383520484 6162575.11382213607430458, 1224053.56950354343280196 6162572.58261423371732235, 1224053.93088088417425752 6162568.51532317325472832, 1224054.82008600467815995 6162566.52211901731789112, 1224056.86548929777927697 6162563.22698064055293798, 1224058.53850392578169703 6162561.77700492180883884, 1224059.57807480124756694 6162559.56153496261686087, 1224063.638050084002316 6162556.94244872219860554, 1224064.55886293272487819 6162555.27549479901790619, 1224065.09493655734695494 6162552.42780060041695833, 1224067.18257577810436487 6162550.35273274127393961, 1224069.27133738645352423 6162547.85359942354261875, 1224070.28053955966606736 6162545.50173510145395994, 1224070.9438753817230463 6162542.82058070693165064, 1224071.59321534936316311 6162542.15862149931490421, 1224072.78275431017391384 6162541.67420738469809294, 1224079.85318258544430137 6162541.44121889304369688, 1224081.5034958787728101 6162540.06754968967288733, 1224082.16325616929680109 6162537.43075534421950579, 1224082.66639802558347583 6162533.45303949993103743, 1224086.11720288335345685 6162533.664596744813025, 1224087.55927210301160812 6162530.66678962390869856, 1224089.49781522643752396 6162525.72221600171178579, 
1224090.21481965715065598 6162521.24830886535346508, 1224092.31510041235014796 6162521.27272722683846951, 1224095.5083566615357995 6162516.57453543227165937, 1224098.80528464680537581 6162510.48945979308336973, 1224101.59780619386583567 6162505.42406322900205851, 1224104.62767662666738033 6162500.80198229197412729, 1224112.62265338143333793 6162490.65363883692771196, 1224115.4331385251134634 6162486.49381989985704422, 1224118.07681362773291767 6162482.55493959505110979, 1224122.5474065775051713 6162479.05374036729335785, 1224126.4876270666718483 6162476.89382366649806499, 1224130.67039938736706972 6162475.01021473109722137, 1224131.42787951068021357 6162474.54674366861581802, 1224132.00965680833905935 6162473.90165932383388281, 1224134.30083793145604432 6162468.38280786294490099, 1224134.43405323172919452 6162466.62996012810617685, 1224133.77381634921766818 6162461.27469981089234352, 1224131.98170563671737909 6162455.38159238453954458, 1224131.00286610866896808 6162451.82002568710595369, 1224129.81940341042354703 6162443.30832320638000965, 1224130.30952043551951647 6162440.82526596635580063, 1224117.93662372697144747 6162436.61186943575739861, 1224118.80936999712139368 6162433.48997475020587444, 1224119.8861940186470747 6162429.48042818531394005, 1224116.88955029100179672 6162424.9746477035805583, 1224118.2866202499717474 6162421.81694066803902388, 1224125.02287895232439041 6162406.75615701638162136, 1224136.9699405487626791 6162380.56328190956264734, 1224146.2872551498003304 6162360.08518203534185886, 1224162.21877324441447854 6162325.75312741007655859, 1224176.15072527155280113 6162294.25110182352364063, 1224190.62602796731516719 6162263.09432496223598719, 1224199.94849838316440582 6162242.24832092504948378, 1224201.07608954515308142 6162240.17390960361808538, 1224207.01629387005232275 6162227.49338328186422586, 1224207.68722697859629989 6162226.05183063447475433, 1224206.98866353183984756 6162221.27388244867324829, 1224187.05383570399135351 
6162211.68371322937309742, 1224178.29924736311659217 6162208.57715178839862347, 1224170.09784745331853628 6162206.30776430945843458, 1224162.09636637894436717 6162205.04794881958514452, 1224135.53315035765990615 6162199.74490021169185638, 1224132.0060327434912324 6162199.34857857879251242, 1224111.11273062322288752 6162198.55512168817222118, 1224107.98962205043062568 6162198.28071891702711582, 1224101.15339007205329835 6162196.73552175611257553, 1224075.95278166700154543 6162185.72727408353239298, 1224064.09238910698331892 6162180.18244643975049257, 1224054.25428140093572438 6162176.03983764816075563, 1224043.18095863796770573 6162170.79297417122870684, 1224026.4819624365773052 6162162.43556529749184847, 1224017.4553956959862262 6162156.85163170471787453, 1224009.44334601238369942 6162153.15783202275633812, 1223982.0541665474884212 6162143.1118852561339736, 1223973.71778683713637292 6162139.43660887330770493, 1223935.56432151352055371 6162126.05547058302909136, 1223920.16368601471185684 6162122.22327663563191891, 1223907.95576872793026268 6162120.36766724195331335, 1223904.55336386570706964 6162120.2717477660626173, 1223901.14448459283448756 6162120.66643703263252974, 1223897.30257452372461557 6162121.60661646164953709, 1223891.41970742819830775 6162123.63228153344243765, 1223884.56755023007281125 6162127.82331806980073452, 1223880.22846365347504616 6162131.33530957344919443, 1223865.7164607485756278 6162145.01001925673335791, 1223858.32824140810407698 6162153.58915425464510918, 1223853.31136482069268823 6162160.88621433544903994, 1223848.93021694826893508 6162168.10061908792704344, 1223842.48522614315152168 6162176.88980413042008877, 1223837.42375852493569255 6162183.20101735834032297, 1223834.80584149388596416 6162185.59057482704520226, 1223828.76590628293342888 6162190.28250081092119217, 1223813.5536962125916034 6162201.45633277203887701, 1223798.6875794876832515 6162197.05362665187567472, 1223730.37756989547051489 6162178.38368698302656412, 
1223716.17439889721572399 6162167.5048780320212245, 1223712.39278160315006971 6162165.2353684538975358, 1223703.77353188255801797 6162160.75629353802651167, 1223696.4086111115757376 6162154.68176866788417101, 1223676.10867135832086205 6162135.75468326266855001, 1223664.46241383138112724 6162126.4216440599411726, 1223657.00495222583413124 6162121.28847487922757864, 1223650.50097496155649424 6162118.0850970521569252, 1223650.05509549542330205 6162118.58491599466651678, 1223649.80267463508062065 6162118.4306162865832448, 1223631.62464188737794757 6162096.28155377879738808, 1223615.07977920188568532 6162078.71787583269178867, 1223583.32050686236470938 6162045.97498243395239115, 1223529.60649617109447718 6161989.66282032988965511, 1223470.99145385064184666 6161929.0382098201662302, 1223447.15821344987489283 6161904.02285820618271828, 1223440.37217920878902078 6161896.51077315025031567, 1223437.80332067515701056 6161894.08247043751180172, 1223347.20683265244588256 6161799.92908723931759596, 1223346.46059662313200533 6161799.53408750519156456, 1223337.31849863240495324 6161789.72262994572520256, 1223320.62971119862049818 6161772.79558200668543577, 1223310.84040110441856086 6161765.87877077888697386, 1223307.888106276281178 6161762.14716018363833427, 1223305.2979928171262145 6161757.10530015826225281, 1223298.95908278878778219 6161756.05870931502431631, 1223297.28730852715671062 6161748.55708755366504192, 1223286.69252956775017083 6161749.47808746062219143, 1223271.79757039272226393 6161752.60868838708847761, 1223267.48481046035885811 6161752.40627552196383476, 1223263.4182254783809185 6161751.92232862673699856, 1223254.28368240129202604 6161750.43843826744705439, 1223246.17446907726116478 6161749.58411384373903275, 1223243.4321100099477917 6161748.79388709738850594, 1223236.43166613997891545 6161741.16430401429533958, 1223235.61186606995761395 6161738.49751098826527596, 1223234.67072534188628197 6161731.89192824624478817, 1223220.83738455502316356 
6161724.92847154289484024, 1223201.29388974700123072 6161713.53132723737508059, 1223191.56463923561386764 6161707.09951291605830193, 1223166.6896414952352643 6161689.54685468506067991, 1223145.12043370842002332 6161672.67360269650816917, 1223138.29740360332652926 6161666.74388496670871973, 1223133.47428651480004191 6161661.77894304972141981, 1223126.83476721844635904 6161654.29018529690802097, 1223103.14811953925527632 6161623.3163890577852726, 1223089.8768907506018877 6161607.21222171187400818, 1223076.45595577824860811 6161591.21880783047527075, 1223074.85365633014589548 6161590.86650781333446503, 1223074.78672359534539282 6161589.74492070265114307, 1223074.34612918365746737 6161588.63788406737148762, 1223073.76322833774611354 6161587.65332862176001072, 1223072.66666948609054089 6161586.57158993743360043, 1223067.25967669091187418 6161584.12692403513938189, 1223057.3690110775642097 6161580.51746600121259689, 1223046.72565198084339499 6161577.00366717483848333, 1223042.03236249927431345 6161574.63883091695606709, 1223038.29348263470456004 6161573.06523269694298506, 1223035.71954026445746422 6161571.00504008587449789, 1223034.01530786277726293 6161570.37665394600480795, 1223018.62266692123375833 6161559.43720267247408628, 1222975.9402892105281353 6161531.62209072522819042, 1222946.22677338914945722 6161511.9506016606464982, 1222913.16785837220959365 6161491.16157607920467854, 1222894.57425141241401434 6161481.21483664494007826, 1222866.07012142310850322 6161470.88320733979344368, 1222848.38551729265600443 6161466.30055391043424606, 1222828.50378772104158998 6161463.818099245429039, 1222787.74978149891830981 6161463.65139749087393284, 1222738.21969573688693345 6161464.33003175724297762, 1222713.65258426731452346 6161466.1699557863175869, 1222672.69788140221498907 6161470.45258061774075031, 1222653.48837593314237893 6161470.72594776283949614, 1222639.42258586408570409 6161469.41534698847681284, 1222606.95334751065820456 6161462.27061797957867384, 
1222591.60187961743213236 6161457.08393715601414442, 1222534.76536257332190871 6161436.95081258099526167, 1222500.47918866528198123 6161401.25342679768800735, 1222475.02067727549001575 6161376.412598617374897, 1222450.27218732936307788 6161356.42861322779208422, 1222426.38674681610427797 6161352.77702770195901394, 1222396.4019301135558635 6161350.7556545240804553, 1222387.50126729905605316 6161349.13558170106261969, 1222374.64552000630646944 6161342.04059944115579128, 1222343.15234377607703209 6161318.63477491121739149, 1222325.49017467349767685 6161304.40037837345153093, 1222315.91394405625760555 6161292.85897684004157782, 1222302.91710554924793541 6161274.03283204417675734, 1222281.09475153032690287 6161238.95962591748684645, 1222262.12729179346933961 6161206.5716970432549715, 1222252.52749804267659783 6161192.02596120815724134, 1222244.35431084199808538 6161182.28294839058071375, 1222230.28375833830796182 6161168.25952672213315964, 1222218.54562807688489556 6161158.27453484199941158, 1222200.28313990752212703 6161146.24694435577839613, 1222157.11810769024305046 6161122.71844348404556513, 1222149.48266446404159069 6161112.87369342148303986, 1222140.30831569223664701 6161095.60536772385239601, 1222133.40577099821530282 6161079.42381110973656178, 1222123.42989882500842214 6161064.40157988201826811, 1222103.98339151358231902 6161038.05887850373983383, 1222101.27797500556334853 6161029.90529985912144184, 1222096.95990289654582739 6161002.1669419864192605, 1222093.17552717053331435 6160988.17826590035110712, 1222088.31512766191735864 6160979.85145682469010353, 1222081.65166267775930464 6160971.40203314460813999, 1222074.48945603449828923 6160966.68523806612938643, 1222059.90599950845353305 6160960.67986856680363417, 1221949.55931735248304904 6160921.94168316572904587, 1221914.16772219375707209 6160906.41764932777732611, 1221908.76035241829231381 6160911.89889739081263542, 1221898.90868846280500293 6160920.14833199325948954, 1221881.27687369659543037 
6160926.32155888061970472, 1221874.12928699352778494 6160929.76548199914395809, 1221870.88997696014121175 6160931.96079692337661982, 1221858.39542704564519227 6160939.10497206822037697, 1221851.86817322624847293 6160942.04066977929323912, 1221854.02236143220216036 6160954.38016717508435249, 1221856.37258439208380878 6160964.27981177251785994, 1221868.55130936671048403 6161008.45472280401736498, 1221871.28189899935387075 6161016.71066042128950357, 1221842.9349606076721102 6161077.94372494798153639, 1221827.31526329391635954 6161084.83681585267186165, 1221809.67023351159878075 6161093.24148600175976753, 1221785.19112321268767118 6161106.34318220801651478, 1221773.26728279632516205 6161146.57228878699243069, 1221770.05759202386252582 6161158.7040274990722537, 1221741.37181971129029989 6161165.44051280245184898, 1221737.04150546947494149 6161165.76233999524265528, 1221732.7468293160200119 6161165.64056089613586664, 1221727.58307045511901379 6161165.00249524973332882, 1221724.67768843518570065 6161165.10394779499620199, 1221722.09238648181781173 6161165.34273280203342438, 1221716.82378337881527841 6161167.04023131914436817, 1221710.17893125885166228 6161170.4130649957805872, 1221694.93934050993993878 6161179.79256580211222172, 1221684.86623432580381632 6161186.68504333961755037, 1221677.20421326789073646 6161184.17206749878823757, 1221666.84523297753185034 6161185.34909357130527496, 1221660.71172570530325174 6161188.53981138858944178, 1221653.77564588095992804 6161192.44739418383687735, 1221649.36933642835356295 6161195.77685600891709328, 1221645.72997668827883899 6161198.83306977897882462, 1221640.80554737197235227 6161203.46033423952758312, 1221638.14217226160690188 6161205.70200120285153389, 1221633.22458369680680335 6161209.21364125609397888, 1221608.77091469103470445 6161220.9785479512065649, 1221606.25839803786948323 6161225.46469826530665159, 1221605.06657277420163155 6161227.93618420138955116, 1221604.39911078591831028 6161232.12405178509652615, 
1221603.77560219285078347 6161239.88721807859838009, 1221602.696326260920614 6161248.17186708655208349, 1221602.13458483805879951 6161259.28852563351392746, 1221602.2319801626726985 6161262.19840951636433601, 1221604.44878273457288742 6161273.76146525517106056, 1221597.26849565282464027 6161292.04816636815667152, 1221594.53807249013334513 6161297.18652037065476179, 1221585.07959827338345349 6161313.95084068179130554, 1221579.57008837093599141 6161322.77261646091938019, 1221573.22032648045569658 6161331.75014727748930454, 1221570.40768946288153529 6161336.88190380018204451, 1221560.98343031434342265 6161348.0681109307333827, 1221547.31071736454032362 6161363.7126928549259901, 1221535.67900943895801902 6161376.61914737429469824, 1221532.56206896086223423 6161380.38706582598388195, 1221530.42107085464522243 6161383.34041980933398008, 1221514.84324634238146245 6161401.0644089300185442, 1221489.30138188763521612 6161429.48489224165678024, 1221472.29890725505538285 6161447.42940821405500174, 1221457.47466718498617411 6161467.11150491889566183, 122)

This is probably not a viable method to deduce geometry and I think it creates overlaps. Besides, the cells look incomplete. I will try to create an adjacency matrix between all localities so you don't have to worry about geometry. I'm off for now though.

JamesBremner commented 1 year ago

I will try to create an adjacency matrix between all localities so you don't have to worry about geometry.

Well, that would be great!

remidelattre commented 1 year ago

Alright, here's a csv with every adjacency. I used geopandas with Python: it avoids creating 348XX rows.

insee = id for each locality
insee_adjacent = adjacent localities (polygons touch each other)
de = no need to explain at this point, lol

If you have issues with the fact that adjacencies are listed in the same row, please use the column wizard in Excel. You'll have empty cells, obviously, but one row will be one id. https://file.io/h1QHqEMJfRx6

JamesBremner commented 1 year ago
insee;insee_adjacent;de
01001;01146 01093 01188 01028 01412 01351;2,6759
01002;01056 01007 01363 01384 01199 01277;0,298986426
01004;01041 01089 01149 01384 01345 01421 01007;-11,28376052
01005;01446 01382 01207 01362 01261 01398 01318;0,58988963
...

That looks perfect.

I will code a reader for this tomorrow.

JamesBremner commented 1 year ago

Guessing that

2,6759

means the number 2.6759 for de

remidelattre commented 1 year ago

yes, exactly, this is "de". my plan for later is to run the algorithm with these values, store the results, then edit values for "de", run it again, and compare the two iterations.
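
A minimal Python sketch of a reader for the format discussed above: semicolons between fields, spaces between adjacent ids, and a decimal comma in de. This is not the grouper app's own reader; the function name `parse_adjacency_list` is made up for illustration, and the sample rows are copied from the excerpt quoted earlier in this thread. The ids are kept as strings so that leading zeros survive.

```python
import csv
import io


def parse_adjacency_list(text):
    """Parse 'insee;insee_adjacent;de' rows into {insee: (neighbours, de)}."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";")
    localities = {}
    for row in reader:
        # Adjacent ids are separated by spaces inside a single field.
        neighbours = row["insee_adjacent"].split()
        # "2,6759" uses a decimal comma; convert it before calling float().
        de = float(row["de"].replace(",", "."))
        localities[row["insee"]] = (neighbours, de)
    return localities


sample = """insee;insee_adjacent;de
01001;01146 01093 01188 01028 01412 01351;2,6759
01004;01041 01089 01149 01384 01345 01421 01007;-11,28376052
"""

localities = parse_adjacency_list(sample)
print(localities["01001"])
```

For a real file, the same function could be fed the contents of the file read with `open(path, encoding="utf-8").read()`; the encoding is an assumption.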

remidelattre commented 1 year ago

newrow adjacency_matrix2.csv Added a row containing regional codes. This allows us to create .csv files containing a smaller number of localities while running the tests within perimeters that are actually relevant to the problem the algorithm intends to solve. If you watched my video, you can see me reducing the sample at the start. This is the same principle, except with actually relevant perimeters.

We may need to address the fact that, using this method, we're asking the program to consider adjacencies that won't exist in the smaller files. Perhaps this would require ignoring localities that the program can't find or cleaning the .csv beforehand. potentialissue
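
One way the "ignore localities the program can't find" option could look, as a hedged Python sketch. The function name `prune_to_known` and the two-row regional sample are hypothetical; the sketch only assumes the adjacency list has already been loaded as a mapping from id to list of neighbour ids.

```python
def prune_to_known(adjacency):
    """Drop neighbour ids that have no row of their own in this file."""
    known = set(adjacency)
    return {
        locality: [n for n in neighbours if n in known]
        for locality, neighbours in adjacency.items()
    }


# Hypothetical regional extract: 01146 and 01188 have no row in this file.
regional = {
    "01001": ["01146", "01093"],
    "01093": ["01001", "01188"],
}

pruned = prune_to_known(regional)
print(pruned)
```

Pruning on load keeps the smaller files untouched, at the cost of silently hiding adjacencies that cross the regional boundary.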

edit 21:25 : we could also run it on the whole dataset, but I can't begin to fathom the computational work required if we consider 4 degrees of adjacency. Besides, I think it'd be difficult to actually visualize.
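
To make "degrees of adjacency" concrete, here is a small Python sketch of a breadth-first search that stops after a fixed number of hops from one locality. It is an illustration only and says nothing about the cost of the grouper's own search; the name `within_degrees` and the four-node chain are made up.

```python
from collections import deque


def within_degrees(adjacency, start, max_degree):
    """Return {locality: degree} for everything within max_degree hops of start."""
    degree = {start: 0}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if degree[current] == max_degree:
            continue  # do not expand beyond the requested degree
        for neighbour in adjacency.get(current, []):
            if neighbour not in degree:
                degree[neighbour] = degree[current] + 1
                queue.append(neighbour)
    return degree


# Hypothetical chain of localities: A - B - C - D
chain = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C"]}
reach = within_degrees(chain, "A", 2)
print(reach)
```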

JamesBremner commented 1 year ago

Added a row containing regional codes. This allows us to create .csv files containing a smaller number of localities

This is a good idea.

However, it seems that you have added a new column, not a row.

image

Also, you keep calling and labelling these files as CSV, which stands for Comma Separated Values. In reality, your files use a mixture of semicolons and spaces to separate values while commas are used to represent the decimal point in floating point numbers.

Oh, and one more small point - it is not an adjacency matrix ( which is rectangular ) but an adjacency list.
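
To illustrate the difference, a short Python sketch that expands an adjacency list into the corresponding matrix. The name `to_matrix` and the three-node sample are hypothetical. The list stores only the pairs that touch, while the matrix has one cell for every pair of localities, so for roughly 35,000 localities it would hold over a billion cells.

```python
def to_matrix(adjacency):
    """Expand {id: [neighbour ids]} into a 0/1 matrix with rows and columns in sorted id order."""
    ids = sorted(adjacency)
    index = {locality: i for i, locality in enumerate(ids)}
    matrix = [[0] * len(ids) for _ in ids]
    for locality, neighbours in adjacency.items():
        for neighbour in neighbours:
            if neighbour in index:  # skip ids that have no row of their own
                matrix[index[locality]][index[neighbour]] = 1
    return ids, matrix


# Hypothetical adjacency list: A touches B, B touches A and C.
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
ids, matrix = to_matrix(adjacency)
for row in matrix:
    print(row)
```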

JamesBremner commented 1 year ago

We may need to address the fact that ...

OK. I will open a new issue for this.

JamesBremner commented 1 year ago

Here is the output from a run of the grouper app using your text file as input

grouper_run_23102023.zip

For a quick look, here are the first few groups output

finished reading 34816 localities
============
01001 2.6759 01146 -1.502172811 01093 -7.803621135 01188 0.524909037 01028 2.06945 01412 1.755569314 01351 2.80125551 sum 0.52129
============
01002 0.298986426 01007 -4.306125486 01363 2.327430935 01199 1.232900641 sum -0.446807
============
01277 1.226157194 01012 -0.42692219 sum 0.799235
============
01004 -11.28376052 01041 1.3825 01089 6.279176383 01149 1.483930913 01345 1.113282445 01421 0.989867419 sum -0.0350034
============
01005 0.58988963 01362 -1.861710215 01261 2.3468 01318 -0.928684912 sum 0.146295
============
01446 0.326165044 01382 2.48955 01021 -1.949675381 sum 0.86604
============
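
The reported sums can be checked against the listed values with a few lines of Python. This sketch assumes each group line alternates locality id and de value and ends with the word sum followed by the total, as in the output above; the helper name `check_group` and the tolerance are made up, the tolerance allowing for the rounding of the printed sums.

```python
def check_group(line, tolerance=1e-5):
    """Recompute the sum of a group's de values and compare it with the reported sum."""
    tokens = line.split()
    marker = tokens.index("sum")
    # Tokens before "sum" alternate id, value, id, value, ...
    values = [float(v) for v in tokens[1:marker:2]]
    reported = float(tokens[marker + 1])
    total = sum(values)
    return abs(total - reported) <= tolerance, total


line = "01002 0.298986426 01007 -4.306125486 01363 2.327430935 01199 1.232900641 sum -0.446807"
ok, total = check_group(line)
print(ok, total)
```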
JamesBremner commented 1 year ago

Released v0.0.3

The input file will be placed in the correct folder when you pull the latest version of this repository

image