security-force-monitor / sfm-cms

Platform for sharing complex information about security forces. Powers WhoWasInCommand.com
https://whowasincommand.com

Organization: Name sources not correctly connected to data point #343

Closed tlongers closed 6 years ago

tlongers commented 7 years ago

@jeancochrane I think there is an issue with source displays:

https://whowasincommand.com/en/organization/detail/212/

On WWIC, the source for the name Operación Conjunta Chihuahua is listed as:

Title: “Currículum Vitae”
Publication: Gobierno del Estado de Guerrero. Access
Published on: 2017-06-23
Source URL: http://administracion2014-2015.guerrero.gob.mx/directorio/406606-2/
Archive URL: https://web.archive.org/web/20170623175145/http://administracion2014-2015.guerrero.gob.mx/directorio/406606-2/
Date added: 2017-10-28 03:28:51.003128+00:00
Date updated: 2017-10-28 03:28:51.003105+00:00
Added by: importer

However, in the import data source, the sources for the name of this record are as follows:

"Ciudad Juárez, Chih., a 29 de abril del 2009.". Secretaría de la Defensa Nacional (Mexico). 29 April 2009. http://www.sedena.gob.mx/servlet/Satellite?c=articulo_C&cid=1393962404969&d=Desktop&pagename=sedena%2Farticulo_C%2FSArticuloSinPaginaLayout Internet Archive link: https://web.archive.org/web/20150816005221/http://www.sedena.gob.mx/servlet/Satellite?c=articulo_C&cid=1393962404969&d=Desktop&pagename=sedena%2Farticulo_C%2FSArticuloSinPaginaLayout

“Lanzan operación conjunta anticrimen en Chihuahua”. El Universal (Mexico). 28 March 2008. http://archivo.eluniversal.com.mx/nacion/158448.html Internet Archive link: http://web.archive.org/web/20170525170054/http://archivo.eluniversal.com.mx/nacion/158448.html

“Desaparición forzada, uno de los saldos perversos de la Operación Chihuahua”. Proceso (México). 5 January 2017. http://www.proceso.com.mx/468757/desaparicion-forzada-uno-los-saldos-perversos-la-operacion-chihuahua Internet Archive link: http://web.archive.org/web/20170525182252/http://www.proceso.com.mx/468757/desaparicion-forzada-uno-los-saldos-perversos-la-operacion-chihuahua

Se suman más de mil trescientos militares a los trabajos de la OCCH. La Red Noticias (Mexico). 21 April 2009. http://www.larednoticias.com/noticias.cfm?n=24906 Internet Archive link: https://web.archive.org/web/20170625035320/http://www.larednoticias.com/noticias.cfm?n=24906

Incautan uniformes y droga en Juárez. El Norte (Mexico). 22 December 2009. (accessed via Lexis)

"Personal militar asegura enervante, vehículos y armamento". Secretaría de la Defensa Nacional (Mexico). 29 April 2009. https://www.gob.mx/sedena/prensa/personal-militar-asegura-enervante-vehiculos-y-armamento Internet Archive link: https://web.archive.org/web/20170709001018/https://www.gob.mx/sedena/prensa/personal-militar-asegura-enervante-vehiculos-y-armamento

“PERSONAL MILITAR DESMANTELA BANDA JUVENIL QUE SE DEDICABA A ROBAR AUTOS CON LUJO DE VIOLENCIA”. Secretaría de Defensa Nacional (México). 15 January 2010. http://www.sedena.gob.mx/pdf/comunicados/2010/mandos/ciudadjuarez_15ene.pdf Internet Archive link: https://web.archive.org/web/20101221061110/http://www.sedena.gob.mx/pdf/comunicados/2010/mandos/ciudadjuarez_15ene.pdf

“Declara milicia estado de guerra en Parral”. El Ágora (Mexico). 15 May 2008. http://www.elagora.com.mx/Declara-milicia-estado-de-guerra,4802.html Internet Archive link: http://web.archive.org/web/20170712171503/http://www.elagora.com.mx/Declara-milicia-estado-de-guerra,4802.html

Relevan a 3 mil militares del Operativo Chihuahua. El Universal (Mexico). 30 October 2009. http://archivo.eluniversal.com.mx/notas/636938.html Internet Archive link: http://web.archive.org/web/20170204223315/http://archivo.eluniversal.com.mx/notas/636938.html

“Personal militar asegura cuatro civiles un vehículo y enervante”. Secretaría de Defensa Nacional (México). 13 January 2010. http://www.sedena.gob.mx/pdf/comunicados/2010/mandos/opn_conjunta_13ene.pdf Internet Archive link: http://web.archive.org/web/20170622180643/http://www.sedena.gob.mx/pdf/comunicados/2010/mandos/opn_conjunta_13ene.pdf

“PERSONAL MILITAR DESMANTELA BANDA JUVENIL QUE SE DEDICABA A ROBAR AUTOS CON LUJO DE VIOLENCIA”. Secretaría de Defensa Nacional (México). 15 January 2010. http://www.sedena.gob.mx/pdf/comunicados/2010/mandos/ciudadjuarez_15ene.pdf Internet Archive link: https://web.archive.org/web/20101221061110/http://www.sedena.gob.mx/pdf/comunicados/2010/mandos/ciudadjuarez_15ene.pdf

“Recomendación No.063 sobre el caso del señor Rubén Coxcahua Marín”. Comisión Nacional de los Derechos Humanos (Mexico). 6 October 2009. http://www.cndh.org.mx/sites/all/doc/Recomendaciones/2009/Rec_2009_063.pdf Internet Archive link: https://web.archive.org/web/20170630205402/http://www.cndh.org.mx/sites/all/doc/Recomendaciones/2009/Rec_2009_063.pdf

jeancochrane commented 7 years ago

Hmm, strange. The parsed source seems clean, so I'm betting that we're simply pulling from the wrong column for OrganizationNames. I'll take a look.

jeancochrane commented 6 years ago

Fixed the typo that was causing us to overwrite new sources for OrganizationName in https://github.com/security-force-monitor/sfm-cms/commit/8079c81ea0771f39b4c08330d49ebabe2e55156b.

Do we want to re-run the importer to fix this @tlongers?

tlongers commented 6 years ago

Is it possible to run it without disruption to the production site?

tlongers commented 6 years ago

Perhaps examine #348 before running?

jeancochrane commented 6 years ago

Good call. Let's wait on #348 at the very least.

We could run the importer without disrupting the production site by running it in a secondary cloned copy of the database, and then switching the connection over once it's done. That way we can also quickly and easily switch it back to the original database if we notice that something unexpected happened during the import, such as permalinks changing (which I don't anticipate, but which is a possibility to keep in mind). The only real downside is time, in that it'll probably take ~an hour to put together the code for that solution, and then the import itself takes a few hours.
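
For illustration only, the clone-then-switch idea can be sketched in a few lines of Python. This is a minimal sketch under loud assumptions: it uses SQLite files from the standard library as a stand-in for the project's real database, and the names `run_import_on_clone`, `read_source`, `demo`, and the `organization_name` table are all hypothetical, not taken from the sfm-cms codebase.

```python
import shutil
import sqlite3
import tempfile
from pathlib import Path


def run_import_on_clone(live_path, importer):
    """Copy the live database file and run `importer` against the copy only."""
    clone_path = live_path.with_name(live_path.stem + "_clone" + live_path.suffix)
    shutil.copyfile(live_path, clone_path)
    conn = sqlite3.connect(clone_path)
    try:
        importer(conn)
        conn.commit()
    finally:
        conn.close()
    return clone_path


def read_source(db_path):
    """Return the source recorded for the single demo organization name."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT source FROM organization_name").fetchone()[0]
    finally:
        conn.close()


def demo():
    # Build a tiny "live" database (hypothetical schema, for illustration).
    workdir = Path(tempfile.mkdtemp())
    live = workdir / "live.db"
    conn = sqlite3.connect(live)
    conn.execute("CREATE TABLE organization_name (value TEXT, source TEXT)")
    conn.execute(
        "INSERT INTO organization_name VALUES (?, ?)",
        ("Operación Conjunta Chihuahua", "stale source"),
    )
    conn.commit()
    conn.close()

    def importer(connection):
        # Stand-in for the real import: rewrite the source for every name.
        connection.execute("UPDATE organization_name SET source = 'reimported source'")

    clone = run_import_on_clone(live, importer)

    # "Switching the connection" here is just repointing the app at the clone;
    # the original file is untouched and remains available for rollback.
    active = clone
    return read_source(live), read_source(active)
```

Calling `demo()` returns the source seen in the original database and in the clone, which shows that the import only touched the clone. A real setup would need the equivalent copy and switch steps for the production database server.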

tlongers commented 6 years ago

I'm not clear on the extent of disruption that the importer causes to the production site. Does the importer do a lot of stuff in the background for a few hours, and then shut down WWIC for a few minutes whilst it dumps loads of data into its db? Or is it a long period of continuous writing and service disruption?

If the latter, I think it's worth the work to have a secondary clone of the database so we can smoothly manage updates.

jeancochrane commented 6 years ago

We can do it either way, but right now the importer is set up to operate on the existing database, which means a prolonged outage. I like the idea of operating on a secondary clone and then switching the connection over, but it'll take a couple of hours of work to set it up in a reproducible way.

tlongers commented 6 years ago

+1. For long-term sanity, I think it's worth setting up a secondary clone. Go for it.

jeancochrane commented 6 years ago

Awesome. The importer bugs from this round are now all fixed, so I'm going to take some time tomorrow to set up the new import strategy.

tlongers commented 6 years ago

How's this coming on?

jeancochrane commented 6 years ago

Very close! The new pipeline is deployed to production, but I haven't run it yet because I was seeing some strange integrity errors when I ran it locally and on staging last Friday. The import should be run on production by mid-week.

tlongers commented 6 years ago

Thanks for the update Jean, appreciated.

jeancochrane commented 6 years ago

Hmmm, strangely enough, the importer is done running on production but I'm still seeing this issue for Operación Conjunta Chihuahua. Investigating now.

tlongers commented 6 years ago

Got it.

jeancochrane commented 6 years ago

I've finally got a fix ready for this! https://github.com/security-force-monitor/sfm-cms/commit/94488ef7efcc5620d9a513bcaacd3affe1fae04a makes everything work locally.

Unfortunately the import is now running out of storage on the server (not enough disk space to hold the temporary files it makes while it's building the database). @tlongers are you OK if I increase the disk size on the staging and production servers? Back-of-the-envelope, you should be looking at an increase of $3/mo, plus about an hour of work to set it up.
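
As a rough illustration of guarding against this kind of failure, a preflight check could compare free disk space with an estimate of what the import needs before starting. This is only a sketch: the function name `has_room_for_import` and the idea of passing in a byte estimate are assumptions, not part of the importer.

```python
import shutil
import tempfile


def has_room_for_import(required_bytes, path=None):
    """Return True if the filesystem holding `path` has at least `required_bytes` free.

    `path` defaults to the system temp directory, on the assumption that
    this is where the import writes its temporary files.
    """
    target = path or tempfile.gettempdir()
    free_bytes = shutil.disk_usage(target).free
    return free_bytes >= required_bytes
```

For example, `has_room_for_import(5 * 1024**3)` would check for roughly 5 GiB of free space; the right threshold depends on how large the temporary files actually get.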

tlongers commented 6 years ago

That's fine.

tlongers commented 6 years ago

Closing.