pelias / whosonfirst

Importer for Who's on First gazetteer
MIT License
28 stars 42 forks source link

fix(sqlite): remove obsolete index creation #431

Closed orangejulius closed 5 years ago

orangejulius commented 5 years ago

Turns out the obsolete index creation is not compatible with the PIP service.

In the case where the sqlite database does not yet have the obsolete index, all PIP workers will attempt to create it, and most will fail to obtain a write lock.

This causes the workers, and thus the whole PIP service, to shut down without updating the SQLite db. Therefore no progress can be made and the PIP service can never start.

It's likely this wasn't caught during development because the SQLite database used for testing already had the index. However, the databases downloaded from dist.whosonfirst.org do not yet have it.

While this index is important for performance when reading from the SQLite database, it will eventually make its way into the official files.

In the meantime, the dataHost parameter in pelias.json can be used in combination with a custom SQLite db with the index already created, if max performance is desired.

missinglink commented 5 years ago

Oh interesting, I didn't consider a case where the disk is not writable.

I tested it on databases with and without the index, in the case where it doesn't previously exist it gets created, and can block for ~20s while it creates it.

I think it's best to leave this code as-is for the majority of users, once the dist files contain the index no cold start penalty will apply.

Is there a reason why we fail to obtain a 'write lock'?

missinglink commented 5 years ago

Another option is to check the file is writable before trying to open the DB in rw mode.

orangejulius commented 5 years ago

To be clear, the disk is writable.

The way I understand it, it's a race condition between the different PIP worker processes. One of the workers will claim the lock, and the others will fail to claim it and then shut down. When any one PIP worker shuts down, the main process shuts itself and all the other workers down.

Another solution might be to wrap the obsolete index creation line in a try/catch block. It would fail for most workers but then they could continue on loading data. They might pay the penalty of not having an index, however.

missinglink commented 5 years ago

I see, this was intended for dev at this stage.

It was designed as a perf fix for a single writer environment and won't work in a cluster environment.

It's a tricky PR to merge because I believe we should leave it as-is for now and merge this once WOF starts generating dist files with the index included.

I wasn't intending 'sqlite: true' to be used in production environments yet.

orangejulius commented 5 years ago

Yeah, it's unfortunate because the performance is really helped by that index. I think we can solve it though.

The PIP service has multiple workers in any environment, including running locally. In fact, while I haven't confirmed it, it's possible that someone could hit this race condition by running any of the non-WOF importers, without having run the WOF importer first.

We might be able to circumvent this by having the WOF downloader create the obsolete index after it's done downloading the database. This would always be a single process environment with no possibility for a race condition, and because the WOF downloader is used everywhere, including the PIP service, it would solve the issue everywhere.

jdomag commented 5 years ago

@orangejulius @missinglink Can you share a details on how to run wof importer against local sqlite so the obsolete is created and so i can point PIP service to this sqlite instance via datahost?

orangejulius commented 5 years ago

While we want to remove this index creation eventually, https://github.com/pelias/whosonfirst/pull/458 allows us to keep the performance benefits while waiting for the index to be present in all data distributed by WOF.