osmlab / field-data-coordinator


Empty export #127

Open dereklieu opened 7 years ago

dereklieu commented 7 years ago

The problem: currently all exports are empty.

One possible issue: all observations are tagged with type: observation.

observe-export filters out all non-node objects so it can tie the latest observations to a specific node. When there are no objects of type: node to be found, the result is empty.
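To make that hypothesis concrete, here is a minimal sketch of the filtering behavior described above. It is not observe-export's actual code: the function name `keepNodes` and the document shapes are assumptions for illustration only.

```javascript
// Hypothetical stand-in for the filtering step: keep only documents of
// type "node" and drop everything else.
function keepNodes(docs) {
  return docs.filter((doc) => doc.type === "node");
}

// Illustrative input in which every document is an observation.
const onlyObservations = [
  { id: "a", type: "observation", tags: { name: "well" } },
  { id: "b", type: "observation", tags: { name: "school" } },
];

// No document of type "node" survives, so there is nothing to export.
console.log(keepNodes(onlyObservations).length);
```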

cc @sethvincent @kamicut could this be the source of the problem?

dereklieu commented 7 years ago

I did some more tests and confirmed that there's OSM data in the osmDB when I'm trying to export. I also made sure to create observations from the synced OSM data, so in theory there should be links between observations and OSM nodes.

Still, the export returns empty because it can't find any nodes to associate the observations with.

When logging this output from observe-export, I noticed that the obs and link properties in the data returned from the osm observations index were identical. This might explain why we can't get any nodes. For example:

[{"key":"1308777074327544859","value":{"type":"observation-link","obs":"16864331653011895953","link":"16864331653011895953"}}]

At this point I'm a little lost as to where to look next. cc @sethvincent @mojodna

mojodna commented 7 years ago

@dereklieu perhaps somewhere in https://github.com/digidem/osm-p2p-observations or in the code that passes observations into it?

@noffle is more familiar with these parts.

hackergrrl commented 7 years ago

If obs and link are the same, then seeing an empty export makes sense. obs should point to the observation document's ID, and link to the ID of the node being observed.
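A quick way to spot this in logged index output is to look for entries whose obs and link are the same ID. The sketch below is only a debugging aid: `findSelfLinks` is a hypothetical helper, and the second entry uses made-up IDs to show what a healthy link would look like.

```javascript
// Returns the observation-link entries whose `obs` and `link` IDs are identical,
// i.e. links that point back at the observation instead of at a node.
function findSelfLinks(entries) {
  return entries.filter(
    (entry) =>
      entry.value.type === "observation-link" &&
      entry.value.obs === entry.value.link
  );
}

const loggedEntries = [
  // The entry logged earlier in this thread: obs and link match.
  {
    key: "1308777074327544859",
    value: {
      type: "observation-link",
      obs: "16864331653011895953",
      link: "16864331653011895953",
    },
  },
  // Hypothetical healthy entry: link holds a (made-up) node ID.
  {
    key: "example-key",
    value: { type: "observation-link", obs: "example-obs", link: "example-node" },
  },
];

console.log(findSelfLinks(loggedEntries).map((entry) => entry.key));
```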

@dereklieu could you upload the LevelDBs of your osmOrgDb and osmObsDbs?

hackergrrl commented 7 years ago

@dereklieu Does this always happen? If you created a fresh osm-p2p database and created a single observation, would its link and obs still match?

dereklieu commented 7 years ago

@noffle yep, I've found this behavior to be very consistent. I also had @felskia recreate it on her machine.

I've uploaded copies of my current DBs here.

hackergrrl commented 7 years ago

@sethvincent How are nodes created on the collector app? There is a function createNode in app/lib/osm-p2p.js, but I don't see it being called anywhere in the collector codebase.

dereklieu commented 7 years ago

Hey @sethvincent revisiting this. Is the createNode function getting called? Wondering if this helps shed some light on why observations don't link to nodes.

sethvincent commented 7 years ago

@dereklieu it looks like that might be it! I don't think it is being called.

dereklieu commented 7 years ago

@sethvincent cool, what would be the lift of using this method instead of what's currently being called?

hackergrrl commented 7 years ago

@dereklieu I think the problem is that whatever is currently being called (I couldn't figure out what) isn't setting up the node/observation relationship correctly.

sethvincent commented 7 years ago

@noffle Could you take another look at this? I'm currently on another project and could use some help debugging this. I think createObservation here is the thing to look at: https://github.com/osmlab/field-data-collection/blob/master/app/lib/osm-p2p.js#L67

That and the initializeObservation, updateObservation, and saveObservation action function calls might be good to look at.

I'm realizing the problem isn't just that createNode isn't called, because that doesn't explain all the osm nodes in the db from the osm import.

hackergrrl commented 7 years ago

On debugging

I haven't been able to get Android+RN working again on my laptop. I have a repro request:

  1. Add console.log statements at app/screens/OsmFeature/view.js:111 and app/screens/Observation/choose-point.js:244 to log the observation params being sent.
  2. Add a console.log statement at app/lib/osm-p2p.js:93 to log the observation-link object being created.
  3. Start with a fresh db (osmOrg and osmObservations)
  4. Create a new observation

What output was logged?

On createNode not being called

There seem to be two paths for new observations. app/screens/OsmFeature/view.js seems to be for observing an existing node. Is app/screens/Observation/choose-point.js for creating a new observation + node? If so, this code path needs to hit app/lib/osm-p2p.js:50.

On the use of the 'osm-p2p-id' tag on observations

I noticed that 'osm-p2p-id' tag is being set on observation documents in order to store the ID of the node that is being observed. That isn't what that tag is for (it's for node documents in the osmObservations database to reference their original osm-p2p ID after they get uploaded to OSM.org). Using a tag for this is fine if it makes the app logic easier to write, but it should be cleared before being written to osm-p2p-db. Adding something like delete doc.tags["osm-p2p-id"]; to app/lib/osm-p2p.js:75 should be fine.
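A minimal sketch of that cleanup, written as a copying helper rather than an in-place delete so the app can keep using the tag internally. The helper name `stripNodeIdTag` and the document shape are assumptions for illustration; this has not been run against the collector codebase.

```javascript
// Returns a copy of the observation document without the app-internal
// "osm-p2p-id" tag. The original document is left untouched.
function stripNodeIdTag(doc) {
  const tags = Object.assign({}, doc.tags);
  delete tags["osm-p2p-id"];
  return Object.assign({}, doc, { tags });
}

// Illustrative observation; field names other than tags["osm-p2p-id"] are made up.
const observation = {
  type: "observation",
  tags: { "osm-p2p-id": "example-node-id", name: "well" },
};

// The copy keeps `name` but no longer carries the node ID tag.
console.log(stripNodeIdTag(observation).tags);
```

The copy would then be what gets written to osm-p2p-db, while the node ID itself goes into the observation-link document.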

sethvincent commented 7 years ago

Here's an update on this: https://github.com/osmlab/field-data-collection/pull/321

My current (mostly untested) hunch is back to the idea that something is off with observe-export and/or something about the link documents stored by the desktop app.

Here's what links now looks like in onObservationLinks when I do an export:

[ { key: '719628114142833931',
    value: 
     { type: 'observation-link',
       obs: '18223479728498982007',
       link: '-7141998533625156-1511393221751-1' } } ]

This looks like the expected values compared to what we had here: https://github.com/osmlab/field-data-coordinator/issues/127#issuecomment-337998558

dereklieu commented 7 years ago

@sethvincent Sweet. I can't comment too much on the state of the linked pr unfortunately, but am happy to take another look at how things are handled in the desktop app.

One question, the links indeed look right there, but is that a direct result of osmlab/field-data-collection#321 or were they working OK before that?

sethvincent commented 6 years ago

After another round of looking into this I found two issues:

New nodes not synced

When the mobile app syncs with the desktop app, the OSM nodes are only sent from the desktop app to the mobile app, so any nodes created on the phone are not sent back to the desktop app. Observations linked to those nodes will not be exported.

Imported OSM nodes do not have the same keys on desktop and mobile

I made an issue for this here: https://github.com/digidem/osm-p2p-db/issues/59

Short version: because the osm.create function creates a random key, when we're importing from osm xml data, the same node will have a different key on desktop and mobile. When observe-export looks for the node key recorded in the observation-link index, it's not found.
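Here is a toy model of that mismatch. It does not use osm-p2p-db; `createStore` is a hypothetical in-memory stand-in whose only assumed property is the one described above, that create assigns a random key.

```javascript
// Hypothetical in-memory store that, like the behavior described above,
// assigns a random key to every document it creates.
function createStore() {
  const docs = new Map();
  return {
    create(doc) {
      const key = Math.floor(Math.random() * Number.MAX_SAFE_INTEGER).toString(16);
      docs.set(key, doc);
      return key;
    },
    get(key) {
      return docs.get(key);
    },
  };
}

// The same OSM node is imported separately on each device.
const osmNode = { type: "node", lat: 1.5, lon: 2.5, tags: { amenity: "school" } };
const desktopStore = createStore();
const mobileStore = createStore();
const desktopKey = desktopStore.create(osmNode);
const mobileKey = mobileStore.create(osmNode);

// An observation made on the phone records the phone's key for the node.
const observationLink = { type: "observation-link", obs: "example-obs", link: mobileKey };

// The phone can resolve the link; the desktop has the node under another key.
console.log(mobileStore.get(observationLink.link) !== undefined);
console.log(desktopStore.get(observationLink.link) !== undefined);
```

If both sides derived the key from something stable in the import (or received the documents through replication rather than re-importing them), the lookup on the desktop would succeed.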

hackergrrl commented 6 years ago

@sethvincent re sync: woah, that's very strange! These are documents in the osmObs database? When hyperlogs sync, all documents are exchanged. Or are the nodes being created in the osmOrg db instead? That could explain the observation data not appearing in the export.

felskia commented 6 years ago

@noffle do you have any capacity to address the two issues @sethvincent found, and if so, when? Please let me know.

Thanks!

hackergrrl commented 6 years ago

Hey @felskia. Thank you for saying something about this! I think we've maybe gotten into a state where each of us was thinking the other person was addressing these issues? :fearful:

New nodes not synced: new nodes should be written to the observation database, not the osmOrg one. This makes them sync properly back to the desktop app.

Imported OSM nodes don't have the same keys: https://github.com/digidem/osm-p2p-db/issues/59 was filed under the osm-p2p-db repo, but this is actually a field-data-{collector,coordinator} bug. This is happening because, on an OSM.org import, we decided to send raw OSM XML over the wire to the mobile clients instead of using osm-p2p-db's replication facilities, I think because of speed concerns. Those concerns might no longer apply with LevelDB as a working storage backend. Checking whether LevelDB makes things faster (thus letting us use osm-p2p-db's replication mechanism) would be a good next step. I think @gmaclennan sent out two emails to y'all on this (entitled Re: Field data performance, sent on Tue, 28 Nov 2017 12:33:38 -0500 and Tue, 28 Nov 2017 12:42:25 -0500), where some perf options were laid out for DevSeed. Let me know if you can find them and we can decide how to proceed on that front.