phetsims / perennial

Maintenance tools that won't change with different versions of chipper checked out
MIT License
2 stars 5 forks source link

Build server should NOT make SQL queries to the website production database! #11

Closed jonathanolson closed 6 years ago

jonathanolson commented 8 years ago

It looks like addTranslator() was added in https://github.com/phetsims/perennial/commit/5ac71c3601835d6b0a8163cbbc80d2b669a2fb07 as part of https://github.com/phetsims/rosetta/issues/83 after I code-reviewed the build server. It apparently now makes a direct DB connection to the production website server.

This can:

  1. Cause random failures to insert translators.
  2. Cause random failures on the website side with things that deal with translators and sim translations, and has the potential to cause website stability issues.
  3. Cause errors similar to what @samreid and I experienced when we were trying to remove an HTML sim translation, as the translation appeared in some places, but didn't appear in the website's administration interface (totally explains the weirdness I experienced earlier).

@ariel-phet, I'm tagging this as high priority due to the above, and the fact that I see this has errored out 149 times in the build server log. Feel free to reprioritize (assigned to you).

Reasons:

  1. The website uses an in-memory cache in front of the DB. If you query the DB for information, it can be out of date, and more seriously, if you insert information into the DB, it can (a) conflict with the in-memory representation and possibly trigger errors the website tries to flush the cache to the DB, and (b) the website won't realize for a certain amount of time (possibly indefinitely?) that the information has been updated.
  2. Due to this, the website essentially "owns" the database. Modifications to it should go through Hibernate and its cache. This should only be bypassed by manual fixes by a website developer (in the past, I've had to flush the cache to the DB, make the change, and clear the cache, but build-server isn't doing that, and shouldn't). Removing the caching almost certainly not a good solution due to performance reasons.
  3. E.g. the following error in build-server's logs (since things got out-of-sync):
Feb 04 2016 07:41:34 MST - info: running SQL command: INSERT INTO user_localized_simulation_mapping SELECT localized_simulation.id, phet_user.id from localized_simulation, simulation, project, phet_user WHERE simulation.name =
 'ph-scale-basics' AND locale = 'vi' AND phet_user.id = 137535 AND simulation = simulation.id AND simulation.project = project.id AND project.type = 2
Feb 04 2016 07:41:34 MST - error: adding to user_localized_simulation_mapping, translator probably already exists for this sim and locale
Feb 04 2016 07:41:34 MST - error:  error: duplicate key value violates unique constraint "user_localized_simulation_mapping_pkey"
    at Connection.parseE (/data/share/phet/phet-repos/perennial/node_modules/pg/lib/connection.js:539:11)
    at Connection.parseMessage (/data/share/phet/phet-repos/perennial/node_modules/pg/lib/connection.js:366:17)
    at Socket.<anonymous> (/data/share/phet/phet-repos/perennial/node_modules/pg/lib/connection.js:105:22)
    at Socket.emit (events.js:95:17)
    at Socket.<anonymous> (_stream_readable.js:764:14)
    at Socket.emit (events.js:92:17)
    at emitReadable_ (_stream_readable.js:426:10)
    at emitReadable (_stream_readable.js:422:5)
    at readableAddChunk (_stream_readable.js:165:9)
    at Socket.Readable.push (_stream_readable.js:127:10)

I'd like to coordinate with @jbphet (for the build-server/rosetta parts) and @mattpen (for the website) so we can set up a way where the build-server/rosetta can send an (authenticated) request to the website that will cause a translation entry to be created.

It also looks like Rosetta is requiring pg-query and setting it up so it can make queries, but it never actually does so. This should also be removed.

jonathanolson commented 8 years ago

https://github.com/phetsims/website/blob/master/src/java/edu/colorado/phet/website/data/LocalizedSimulation.hbm.xml shows that Hibernate is responsible for this table.

Looks like Rosetta (and not Hibernate) "own" certain tables (@phet-steele pointed me to https://github.com/phetsims/rosetta/commit/ffdd4ca7b508939223d7e9c95ec613d622452e16), which is an interaction I wasn't aware of. It doesn't seem to have foreign keys to the other tables, so it won't break the website (that's good).

It may be helpful for me to review Rosetta some with @jbphet, to document how they are linked. I was unaware that rolling back the website's production database would also roll back rosetta.

jbphet commented 8 years ago

The interaction that Rosetta has the the DB is for saving non-submitted user strings. This was all done by @aaronsamuel137, so I'm no expert on it, but that's the big picture. As part of our discussion, I'd also like to determine whether this is okay or should be done differently, and whether this would be affected by rollbacks of the DB.

ariel-phet commented 8 years ago

priority seems appropriate setting @jonathanolson as assignee

jonathanolson commented 8 years ago

Proposal:

Use a URL like: https://phet.colorado.edu/services/add-html-translator?simName=molarity&locale=es&userId=245345&authorizationCode=insertHere, possibly with the same authorization code.

This would add the relevant entry to the user_localized_simulation_mapping table, looking up the matching HTML sim LocalizedTranslation id.

If the request returns a status code of 201, the translator entry was created, or 400 (bad request) if the entry already existed, OR if no matching user or translation is found.

@mattpen, can you implement this, and coordinate testing and integration with @jbphet?

jonathanolson commented 8 years ago

An authorization code OR checking to see if the request came from localhost should be required, so 3rd parties can't just randomly add entries to our database.

mattpen commented 8 years ago

@jonathanolson - If the user is not from localhost or does not provide an authorization code, should the request return 401 - unauthorized?

mattpen commented 8 years ago

authorizationCode=insertHere, possibly with the same authorization code.

@jonathanolson - Can you please clarify? Do you mean that simply the authorization code should be "insertHere"? Shouldn't we use something more secure, like a key stored in the db?

@jbphet - I wrote an initial draft for the website request. It should be ready to test.

jonathanolson commented 8 years ago

Do you mean that simply the authorization code should be "insertHere"?

Sorry, wasn't suggesting that. GitHub doesn't treat angle brackets nicely, so I didn't include them here. Randomly generated code sounds best.

jbphet commented 8 years ago

@mattpen - Is this on phet-dev, or am I testing on the main server?

mattpen commented 8 years ago

@jbphet - I can deploy it to phet-dev tonight. I was originally thinking we would test it on the VM first, but I don't think build-server is installed on the VM.

Do you have an example query that I could run on the VM to do a basic test before deploying on phet-dev?

mattpen commented 8 years ago

@jonathanolson - I'm assuming this is for HTML sims only. Can you confirm?

mattpen commented 8 years ago

@jbphet - This is now up on phet-dev. I'm logging details about any errors in the tomcat logs, let me know if you want to coordinate testing.

It should work via authorizationCode.
To get the code for phet-dev you can use this command:

echo 'cat //Context/Parameter[@name="add-html-translator-auth-code"]/@value' | xmllint --shell /usr/local/adm/config/tomcat/conf/context.xml | awk -F'[="]' '!/>/{print $(NF-1)}'
jbphet commented 8 years ago

Thanks @mattpen - I'll hopefully try this out within the next few days.

jonathanolson commented 8 years ago

It looks like this caused a JDBC sync error on the website when I tried to create a copy of a translation. I recall someone else reporting something similar.

Increasing priority(!), as it's now eating into my time (checking whether things were corrupted, whether IDs were skipped, etc.)

10 Mar 2016 22:33:59,025  WARN            HibernateUtils - exception:
java.lang.ArrayIndexOutOfBoundsException
10 Mar 2016 22:33:59,025  INFO            HibernateUtils - Attempting to roll back
10 Mar 2016 22:33:59,040  WARN     JDBCExceptionReporter - SQL Error: 0, SQLState: null
10 Mar 2016 22:33:59,040 ERROR     JDBCExceptionReporter - org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement with address: "insert into transla
ed_string (key, value, createdAt, updatedAt, translation, id) values ('html5.donateToday', 'Doneer vandaag', '2016-03-06 07:03:23.167000 -07:00:00',
2016-03-06 07:03:23.167000 -07:00:00', 1700, 1418052)" is closed.
10 Mar 2016 22:33:59,041 ERROR AbstractFlushingEventListener - Could not synchronize database state with session
org.hibernate.exception.GenericJDBCException: Could not execute JDBC batch update
        at org.hibernate.exception.SQLStateConverter.handledNonSpecificException(SQLStateConverter.java:140)
        at org.hibernate.exception.SQLStateConverter.convert(SQLStateConverter.java:128)
        at org.hibernate.exception.JDBCExceptionHelper.convert(JDBCExceptionHelper.java:66)
        at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:275)
        at org.hibernate.jdbc.AbstractBatcher.prepareStatement(AbstractBatcher.java:114)
        at org.hibernate.jdbc.AbstractBatcher.prepareStatement(AbstractBatcher.java:109)
        at org.hibernate.jdbc.AbstractBatcher.prepareBatchStatement(AbstractBatcher.java:244)
        at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2395)
        at org.hibernate.persister.entity.AbstractEntityPersister.insert(AbstractEntityPersister.java:2858)
        at org.hibernate.action.EntityInsertAction.execute(EntityInsertAction.java:79)
        at org.hibernate.engine.ActionQueue.execute(ActionQueue.java:268)
        at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:260)
        at org.hibernate.engine.ActionQueue.executeActions(ActionQueue.java:179)
        at org.hibernate.event.def.AbstractFlushingEventListener.performExecutions(AbstractFlushingEventListener.java:321)
        at org.hibernate.event.def.DefaultAutoFlushEventListener.onAutoFlush(DefaultAutoFlushEventListener.java:64)
        at org.hibernate.impl.SessionImpl.autoFlushIfRequired(SessionImpl.java:1175)
        at org.hibernate.impl.SessionImpl.list(SessionImpl.java:1251)
        at org.hibernate.impl.QueryImpl.list(QueryImpl.java:102)
        at org.hibernate.impl.AbstractQueryImpl.uniqueResult(AbstractQueryImpl.java:859)
        at edu.colorado.phet.website.data.Translation$1.run(Translation.java:98)
        at edu.colorado.phet.website.data.Translation$1.run(Translation.java:96)
        at edu.colorado.phet.website.util.hibernate.HibernateUtils.transactionCore(HibernateUtils.java:714)
        at edu.colorado.phet.website.util.hibernate.HibernateUtils.resultCatchTransaction(HibernateUtils.java:662)
        at edu.colorado.phet.website.data.Translation.isPublished(Translation.java:96)
        at edu.colorado.phet.website.translation.TranslationListPanel$TranslationListView.populateItem(TranslationListPanel.java:145)
        at org.apache.wicket.markup.html.list.ListView.onPopulate(ListView.java:562)
        at org.apache.wicket.markup.repeater.AbstractRepeater.onBeforeRender(AbstractRepeater.java:131)
        at org.apache.wicket.Component.internalBeforeRender(Component.java:1065)
        at org.apache.wicket.Component.beforeRender(Component.java:1099)
        at org.apache.wicket.MarkupContainer.onBeforeRenderChildren(MarkupContainer.java:1753)
        at org.apache.wicket.Component.onBeforeRender(Component.java:3940)
        at org.apache.wicket.Component.internalBeforeRender(Component.java:1065)
        at org.apache.wicket.Component.beforeRender(Component.java:1099)
        at org.apache.wicket.MarkupContainer.onBeforeRenderChildren(MarkupContainer.java:1753)
        at org.apache.wicket.Component.onBeforeRender(Component.java:3940)
        at org.apache.wicket.Page.onBeforeRender(Page.java:1550)
        at edu.colorado.phet.website.templates.PhetPage.onBeforeRender(PhetPage.java:325)
        at org.apache.wicket.Component.internalBeforeRender(Component.java:1065)
        at org.apache.wicket.Component.beforeRender(Component.java:1099)
        at org.apache.wicket.Component.prepareForRender(Component.java:2282)
        at org.apache.wicket.Page.prepareForRender(Page.java:1540)
        at org.apache.wicket.Component.prepareForRender(Component.java:2319)
        at org.apache.wicket.Page.renderPage(Page.java:911)
        at org.apache.wicket.protocol.http.WebRequestCycle.redirectTo(WebRequestCycle.java:186)
        at org.apache.wicket.request.target.component.PageRequestTarget.respond(PageRequestTarget.java:58)
        at org.apache.wicket.request.AbstractRequestCycleProcessor.respond(AbstractRequestCycleProcessor.java:105)
        at org.apache.wicket.RequestCycle.processEventsAndRespond(RequestCycle.java:1258)
        at org.apache.wicket.RequestCycle.step(RequestCycle.java:1329)
        at org.apache.wicket.RequestCycle.steps(RequestCycle.java:1436)
        at org.apache.wicket.RequestCycle.request(RequestCycle.java:545)
        at org.apache.wicket.protocol.http.WicketFilter.doGet(WicketFilter.java:486)
        at org.apache.wicket.protocol.http.WicketFilter.doFilter(WicketFilter.java:319)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at edu.colorado.phet.website.services.URLFilter.doFilter(URLFilter.java:75)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:241)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:208)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:220)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:122)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:171)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:103)
        at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:950)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:116)
        at org.apache.catalina.valves.RemoteIpValve.invoke(RemoteIpValve.java:683)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:408)
        at org.apache.coyote.ajp.AjpAprProcessor.process(AjpAprProcessor.java:188)
        at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:611)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.doRun(AprEndpoint.java:2440)
        at org.apache.tomcat.util.net.AprEndpoint$SocketProcessor.run(AprEndpoint.java:2429)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)
        at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.SQLException: org.apache.tomcat.dbcp.dbcp.DelegatingPreparedStatement with address: "insert into translated_string (key, value, c
eatedAt, updatedAt, translation, id) values ('html5.donateToday', 'Doneer vandaag', '2016-03-06 07:03:23.167000 -07:00:00', '2016-03-06 07:03:23.1670
0 -07:00:00', 1700, 1418052)" is closed.
        at org.apache.tomcat.dbcp.dbcp.DelegatingStatement.checkOpen(DelegatingStatement.java:137)
        at org.apache.tomcat.dbcp.dbcp.DelegatingStatement.executeBatch(DelegatingStatement.java:297)
        at org.apache.tomcat.dbcp.dbcp.DelegatingStatement.executeBatch(DelegatingStatement.java:297)
        at org.hibernate.jdbc.BatchingBatcher.doExecuteBatch(BatchingBatcher.java:70)
        at org.hibernate.jdbc.AbstractBatcher.executeBatch(AbstractBatcher.java:268)
        ... 70 more
jonathanolson commented 8 years ago

I've confirmed what looks like other examples of errors (see https://github.com/phetsims/website/issues/402) due to this, where translations are missing translators.

jbphet commented 8 years ago

Okay, I'll try to work on this today or next business day.

mattpen commented 8 years ago

@jbphet - The API method was deployed to figaro tonight and should be working.

jbphet commented 8 years ago

The code has been added to the build server to make it interact with the translation credits DB table via a URL, and I worked with @mattpen and @jonathanolson to test it out on phet-dev and it appeared to work. @mattpen has deployed the website-side functionality to figaro, and I just now (~12:30 pm MDT 3/14/2016) pulled the new build server code to figaro and restart the server. I will leave this GitHub issue open and assigned to me until I see a new translation submitted and built, and will work with @mattpen to verify that the interaction with the DB worked as expected.

jbphet commented 8 years ago

At around 5:24 AM MDT this morning (March 15th), a translation was submitted for John Travoltage for the pt_BR (Brazilian Portuguese) locale, and the build server log shows that it used the URL to try to submit the translation credit information. @mattpen - can you check and see if the credits information made it correctly into the DB? If so, this issue can be closed.

mattpen commented 8 years ago

http://phet.colorado.edu/en/for-translators/translation-credits#john-travoltage-header

@jbphet - Is the correct translator showing up on this page for this translation?

jbphet commented 8 years ago

Yes, that looks correct. I also verified another that happened overnight. I'll take this to mean that the issue can be resolved. Closing.

jbphet commented 8 years ago

I just checked through the build server logs, and it looks like a significant proportion of the attempts to add translator credits are returning with a value of 400 instead of 201 (36 out of 96, or about 38%). A return code of 400 apparently means "bad request", so I think we need to dig into this and figure out why there are this many apparent failures. The build server log contains the URL and the time at which it was submitted, so this data can be provided if needed. Assigning to @mattpen for investigation of website side of the equation, @jonathanolson is welcome to jump in if desired. I'll tag for dev meeting so we can discuss and prioritize if no one has gotten to it by then.

jonathanolson commented 8 years ago

If it helps, one of the Tomcat log statements from one of those:

04 Apr 2016 08:46:50,548  INFO         AddHtmlTranslator - null value or conflict
jonathanolson commented 8 years ago

Looks like that's the response when a translator is repeatedly added.

The first time, it succeeds. Then every time after, they are already marked as a translator.

We could potentially move "success = true" outside the check for !simTranslations.contains( localizedSimulation) so that this is treated as a success?

jbphet commented 8 years ago

Discussed in the 4/7/2016 developer meeting, decided that this isn't a high-priority issue since the data in the DB is fine. This can be addressed after the phet-io and new server work is done.

mattpen commented 6 years ago

The build-server is no longer receiving the error above because I improved the logging in August 2016. However, there are still occasional errors (30 since the switch to phet-server from figaro). Here's the complete list:

01 Sep 2016 13:17:53,207  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
10 Sep 2016 21:26:50,105  INFO         AddHtmlTranslator - Not found:  locale:zh_CN
12 Sep 2016 10:31:46,645  INFO         AddHtmlTranslator - Not found:  locale:de
14 Sep 2016 01:42:10,480  INFO         AddHtmlTranslator - Not found:  locale:ms
23 Sep 2016 03:55:09,553  INFO         AddHtmlTranslator - Not found:  locale:hu
09 Oct 2016 08:39:35,924  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
09 Oct 2016 18:40:33,375  INFO         AddHtmlTranslator - Not found:  locale:es
09 Oct 2016 21:41:34,814  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
10 Oct 2016 01:58:49,819  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
10 Oct 2016 09:14:02,575  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
30 Jan 2017 00:23:04,397  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
07 Feb 2017 06:30:37,894  INFO         AddHtmlTranslator - Not found:  locale:mr
07 Feb 2017 06:37:21,220  INFO         AddHtmlTranslator - Not found:  locale:mr
03 Mar 2017 05:26:15,289  INFO         AddHtmlTranslator - Not found:  locale:mr
21 Mar 2017 15:01:52,326  INFO         AddHtmlTranslator - Not found:  locale:eu
12 Apr 2017 08:19:02,056  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
13 Apr 2017 14:41:23,843  INFO         AddHtmlTranslator - Not found:  locale:or
31 May 2017 11:01:30,394  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
19 Jun 2017 16:04:04,941  INFO         AddHtmlTranslator - Not found:  locale:ja
21 Jul 2017 11:54:21,741  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
24 Jul 2017 21:55:18,651  INFO         AddHtmlTranslator - Not found:  locale:ru
05 Aug 2017 10:28:30,616  INFO         AddHtmlTranslator - Not found:  locale:sr
05 Aug 2017 10:32:30,465  INFO         AddHtmlTranslator - Not found:  locale:sr
14 Aug 2017 22:34:59,458  INFO         AddHtmlTranslator - Unauthorized attempt to add translation
15 Aug 2017 02:42:40,544  INFO         AddHtmlTranslator - Not found:  locale:kk
15 Aug 2017 13:17:14,806  INFO         AddHtmlTranslator - Not found:  locale:nb
15 Aug 2017 13:18:00,870  INFO         AddHtmlTranslator - Not found:  locale:nb
20 Aug 2017 06:45:32,120  INFO         AddHtmlTranslator - Not found:  locale:hi
20 Oct 2017 04:47:10,626  INFO         AddHtmlTranslator - Not found:  locale:mn

I'm not sure what would be causing the unauthorized attempts, maybe some web crawler grabbed the URL from GitHub? The Locale Not Found indicates that a particular localized simulation is missing, however, each of these locales has many translated sims. I'll need to improve the logging on the website to understand the failures, but they don't appear to be causing any problems and it could be months before we see any results.

This work should not be part of chipper 2.0.

mattpen commented 6 years ago

This work was addressed as part of the updates for add translator in chipper 2.0 and is no longer a problem. Closing.