Open jywarren opened 5 years ago
@jywarren Let's focus on @Divy123 's work on https://github.com/publiclab/image-sequencer/pull/1093 Divy, has found out that even disabling pace based progress reporting somehow does not solve the memory leak issues associated with it. Let's make sure that is working and if we're lucky that would solve our issues here too! Fingers crossed!! :smile:
Okay I updated the dependency! Let's see if this works now! @jywarren @icarito Could you please redeploy??
Oh, I thought you'd worked out how to redeploy. Can you ping @icarito in the chatroom for help then? Sorry I misunderstood!
On Fri, Jun 21, 2019, 9:51 PM Varun Gupta notifications@github.com wrote:
Okay I updated the dependency! Let's see if this works now! @jywarren https://github.com/jywarren @icarito https://github.com/icarito Could you please redeploy??
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6JZ6LZ6WEDPMPBKCLSTP3WAQZA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYJ5RCQ#issuecomment-504617098, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6JZGPA2XTS3PM2YGVLDP3WAQZANCNFSM4HWZEIJA .
Hi! I figured out how to reliably autodeploy here's a PR: https://github.com/publiclab/image-sequencer-app/pull/20 :rocket:
Awesome! I'll merge it!!
@icarito Is the latest version now deployed??
Maybe best ask in the chatroom?
On Sun, Jun 23, 2019, 8:08 AM Varun Gupta notifications@github.com wrote:
@icarito https://github.com/icarito Is the latest version now deployed??
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J23UIBIMVKQKBTPVBTP35RRRA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYK44RQ#issuecomment-504745542, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J72JEVNJTKCAEJAC23P35RRRANCNFSM4HWZEIJA .
Oh, okay! Let's work on this tomorrow though. Great game going on!π
@jywarren I just found out the underlying issue! I ran a large map on my local machine and this happened
<--- Last few GCs --->
[72306:0x103800000] 310156 ms: Mark-sweep 1224.8 (1231.4) -> 1224.8 (1231.4) MB, 11.3 / 0.0 ms (average mu = 0.791, current mu = 0.001) last resort GC in old space requested
[72306:0x103800000] 310166 ms: Mark-sweep 1224.8 (1231.4) -> 1224.8 (1231.4) MB, 10.1 / 0.0 ms (average mu = 0.683, current mu = 0.001) last resort GC in old space requested
<--- JS stacktrace --->
==== JS stack trace =========================================
0: ExitFrame [pc: 0x24b0e57dbe3d]
1: StubFrame [pc: 0x24b0e579059d]
Security context: 0x090ec5b1e6e9 <JSObject>
2: replace [0x90ec5b105e9](this=0x090e5859bab1 <Very long string[199723668]>,0x090e5859bb01 <String[8]: output>9>,0x090ee7102201 <Very long string[24912228]>)
3: process [0x90e3162a931] [/Users/varugupta/Desktop/image-sequencer-app/src/api/v2/index.js:122] [bytecode=0x90ea515d861 offset=187](this=0x090e2c68d461 <...
FATAL ERROR: CALL_AND_RETRY_LAST Allocation failed - JavaScript heap out of memory
1: 0x10003cf99 node::Abort() [/usr/local/bin/node]
2: 0x10003d1a3 node::OnFatalError(char const*, char const*) [/usr/local/bin/node]
3: 0x1001b7835 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [/usr/local/bin/node]
4: 0x100585682 v8::internal::Heap::FatalProcessOutOfMemory(char const*) [/usr/local/bin/node]
5: 0x10058eb84 v8::internal::Heap::AllocateRawWithRetryOrFail(int, v8::internal::AllocationSpace, v8::internal::AllocationAlignment) [/usr/local/bin/node]
6: 0x100560435 v8::internal::Factory::NewRawOneByteString(int, v8::internal::PretenureFlag) [/usr/local/bin/node]
7: 0x100692548 v8::internal::String::SlowFlatten(v8::internal::Handle<v8::internal::ConsString>, v8::internal::PretenureFlag) [/usr/local/bin/node]
8: 0x1006b18f3 v8::internal::String::IndexOf(v8::internal::Isolate*, v8::internal::Handle<v8::internal::String>, v8::internal::Handle<v8::internal::String>, int) [/usr/local/bin/node]
9: 0x10083777f v8::internal::Runtime_StringIndexOfUnchecked(int, v8::internal::Object**, v8::internal::Isolate*) [/usr/local/bin/node]
10: 0x24b0e57dbe3d
11: 0x24b0e579059d
12: 0x24b0e5c346da
[1] 72305 abort npm run start
This looks like bounding the number of processes didn't really work! It's still processing everything concurrently! I would have to debug and see why this is happening!!
Or maybe it did work and we just can't hold that much data in js heap memory!!
We might have to write these files onto disk and then pick them up one by one when overlaying!!
Alright!!!! :tada: :tada: This is working!!! So my doubts were right! The page failing was a browser thing, since the browser places a new request every few seconds, it kind of baffles the server! I tried it with the new upload option, on a local machine and increased the allowed memory with a flag --max-old-space-size=8192, this gives 8gb ram to the node process! I monitored it and it was maxing at around 3 gb!! And it worked!! @jywarren You can see it in the bucket as export73587.html :smiley:
You can find the file here https://console.cloud.google.com/storage/browser/mapknitter-is?project=public-lab&authuser=1
It's so big that even screenshot won't upload to github! Yikes!
@icarito What is the max memory we can use on the cluster for our purposes??
Alright!!!! π π This is working!!!
That looks great. So, we need to:
increased the allowed memory with a flag
--max-old-space-size=8192
, this gives 8gb ram to the node process! I monitored it and it was maxing at around 3 gb!!
But you're asking what the max is so you can set it to that?
Do we need to lower the memory footprint by storing the files temporarily, or staggering when we load and unload them from memory?
Hmm, I think we can store files temporarily, but that would introduce considerable delays. I think we can set the max memory, since for this map it peaked at 2.3gb so I think if we have 8 gb of memory, it should be fine!
Ok great let's try that first.
On Mon, Jun 24, 2019, 1:59 PM Varun Gupta notifications@github.com wrote:
Hmm, I think we can store files temporarily, but that would introduce considerable delays. I think we can aet the max memory, since for this map it peaked at 2.3gb so I think if we have 8 gb of memory, it should be fine!
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J2TZWFXJCC3S43HMDTP4EDNPA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYNXGJY#issuecomment-505115431, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J47QQNVHLCKA4RUSWLP4EDNPANCNFSM4HWZEIJA .
I don't have my laptop with me, I'll push this code in as soon as I can!! We should now discuss how to store gcloud secrets on the cloud though.
OK, let's start a new issue for that, using these docs as a starting point:
https://cloud.google.com/kms/docs/secret-management https://cloud.google.com/nodejs/docs/tutorials/bookshelf-on-kubernetes-engine#creating_a_cloud_storage_bucket
However, if it's being run from inside Google Kubernetes Engine, won't it already be configured as an authenticated user, and may not need additional secrets added?
On Mon, Jun 24, 2019 at 2:03 PM Varun Gupta notifications@github.com wrote:
I don't have my laptop with me, I'll push this code in as soon as I can!! We should now discuss how to store gcloud secrets on the cloud though.
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J3SZNZXWNGOQ7OAIW3P4ED6PA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYNXTAI#issuecomment-505117057, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J5RJJ2U7R3PXWNQB6LP4ED6PANCNFSM4HWZEIJA .
Well, I thought so too! But how do we access the credentials inside the container then?
In Ruby the authentication is a separate step than upload/download, so I wonder if you can just skip it? Or use locally available credentials like a Json key available locally? It should be in the docs!
I'll have a look at the docs today!! :smile:
Okay!! It looks like all we have to do is add a config.json file with the bucket's name and we should be able to access it directly without any secrets!! Awesome!
AWESOME!
On Tue, Jun 25, 2019 at 1:14 AM Varun Gupta notifications@github.com wrote:
Okay!! It looks like all we have to do is add a config.json file with the bucket's name and we should be able to access it directly without any secrets!! Awesome!
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J3PHSRAYJEPFRHSNQ3P4GSSPA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYPAZOI#issuecomment-505285817, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J365ICUU2RFAWQMXOTP4GSSPANCNFSM4HWZEIJA .
Okay 2 things here:
@jywarren Do you think Sebastian can help with the second part?
Okay, I have removed the code which was causing the exception. All we would need now are cloud credentials in the container and it should start to work. I have also updated the start command to use up to 8gb of memory.
Hi @tech4GT we can pass the credentials to the app using environment variables. Here's how it was done in our Ruby app: https://github.com/publiclab/mapknitter-exporter-sinatra/blob/main/files.yml Name and use variables, and I'll be generaring the keys and adding them to the deployment environment.
Awesome, thanks @icarito -- @tech4GT does this make sense?
Hi both! Yeah that makes sense! Just super busy with stuff! Not sure if I'll be going home this weekend but I'll try to!
Thanks, good luck with your work, and looking forward to it. We've just got the very first export run from within Leaflet.DistortableImage working yesterday on the Ruby exporter, so it'll be exciting to offer this option as well! Thanks Varun!
On Tue, Jul 2, 2019 at 11:35 AM Varun Gupta notifications@github.com wrote:
Hi both! Yeah that makes sense! Just super busy with stuff! Not sure if I'll be going home this weekend but I'll try to!
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J4SIB5PSBQCUHDZW6DP5NYT5A5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODZBVTIA#issuecomment-507730336, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J4GZS46YGQVFO56SRTP5NYT5ANCNFSM4HWZEIJA .
Looks like it automatically pulls the config from the environment! @icarito
OK, should i be using these now?
And got Your Image is exporting, to load Image please visit, http://34.74.118.242/api/v2/status/?pid=28371
but that last status URL took a while to load. Shouldn't it be pretty quick?
Hmm, the status URL is still showing Still working on it
Yeah we still have to add the credentials as environment variables! I think as soon as @icarito adds the credentials in the environment the npm module should automatically use them for authentication! That's what I figured from the documentation!
It is saying still working on it
because we are unable to upload images to the bucket without authentication!
I have an idea, I'll enable the gCloud upload via a query parameter! And for now add an option to export it as a local file! So you can try it out on your local machine. Since we would not want this code to be pushed to the container though, I am opening a new dev branch for this.
I'll document this more in the issue I am opening!
Hm, ok i ran this: http://34.74.118.242/api/v2/export/?url=https://mapknitter.org/maps/nottingham-city-centre/warpables.json
and after the redirect, got:
This site canβt be reached 34.74.118.242 refused to connect.
Search Google for api process
ERR_CONNECTION_REFUSED
``
I reloaded again a bit later, and got:
This page isnβt working 34.74.118.242 didnβt send any data.
ERR_EMPTY_RESPONSE
For reference, here is the URL: https://gist.github.com/jywarren/9b8224cdcbd2a50d9adfe097d5a8d609
Hi @jywarren I've been trying to reach Sebastian, maybe he can help with this? Actually I kind of ran into some trouble at the university this week. I'll be on this Monday morning! Anyway this looks like an issue with the deployment, would it possible for you to run this locally for now?
Hi, I analyzed again how to achieve this and am a little bit confused with the process (creating a service account, granting privileges (done, done) - downloaded json key (done)...
now I need to create a kubernetes "secret" and then tell the container to load this secret as a file and then add a reference as an environment variable and test... sorry it turned out to be complex IMHO.
OK, thank you! @icarito do you need more information or input from @tech4gt or just a little more time? Thank you!
On Thu, Aug 15, 2019 at 3:08 AM Sebastian Silva notifications@github.com wrote:
Hi, I analyzed again how to achieve this and am a little bit confused with the process (creating a service account, granting privileges (done, done) - downloaded json key (done)...
now I need to create a kubernetes "secret" and then tell the container to load this secret as a file and then add a reference as an environment variable and test... sorry it turned out to be complex IMHO.
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J7LRAZLI2RMPZRCLADQET6FVA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4LBEAA#issuecomment-521540096, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J34V4CPCQQVDP5KO2LQET6FVANCNFSM4HWZEIJA .
Sorry for the delay. For reference next time it'd be great to indicate what environment variable is needed and what it needs to contain! In this case I believe it should be GOOGLE_APPLICATION_CREDENTIALS and it should point to the json credentials of a google cloud platform service account (from https://cloud.google.com/storage/docs/reference/libraries). Give me a little more time to figure out where to store it as a kubernetes secret and deploy and test it. Thanks!
OK, i see 2 issues.
First, the google cloud storage credentials, which we think we can solve similarly to how we did the Sinatra Ruby app: https://github.com/publiclab/mapknitter-exporter-sinatra/blob/b065d1de02730347255d1df4c55c9368ca53ec60/app.rb#L53
Second, the issue with the app not connecting. What's interesting here is:
warpables.json
file, and This site canβt be reached 34.74.118.242 refused to connect. ERR_CONNECTION_REFUSED
(One observation: that long URL has ?upload=undefined&steps=...
so perhaps it's entangled with the upload parameter, but it seems to also hang the same way if I remove that parameter entirely, so maybe that's not an issue.)
Anyways, @icarito had a good idea that perhaps the process is taking really long and not answering because it's running, and then we get a timeout. In the Ruby exporter, we solved this by forking the process and returning a unique URL for where to go to read the status (and eventually download the output). That allows the process to run its course and we can just get a quick response with a URL to refresh from, to keep checking for the output file. Once the process is complete, the file is uploaded into the same folder in Google Storage, and we update a status.json file (example: https://mapknitter-exports-warps.storage.googleapis.com/1565813711/status.json)
So, if we get storage working, we can create a status.json
file in the same format, and also upload the output into the same bucket in Cloud Storage, and we can reply to the request immediately.
Hey!
I was able to get it to export to the bucket but first I had to figure out an issue, that the upload
variable ends up being undefined
unless one sets &upload=true
as a GET parameter.
Now it seems to work saving to the maknitter-is
bucket.
Awesome! This works like a charm!! :100: Thanks a lot for your help @icarito
Looking great -- i think we need to increment the pid
though as it's
getting overwritten mapybe?
Could we use a more unique id?
On Tue, Aug 20, 2019 at 9:58 AM Varun Gupta notifications@github.com wrote:
Awesome! This works like a charm!! π― Thanks a lot for your help @icarito https://github.com/icarito
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J2UHXMXHLA3ZPUN3ETQFP2ABA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4WMJNY#issuecomment-523027639, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J7SZDV3NPYF7FDYGP3QFP2ABANCNFSM4HWZEIJA .
Looking great -- i think we need to increment the
pid
though as it's getting overwritten mapybe? Could we use a more unique id? β¦ On Tue, Aug 20, 2019 at 9:58 AM Varun Gupta @.***> wrote: Awesome! This works like a charm!! Thanks a lot for your help @icarito https://github.com/icarito β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#19?email_source=notifications&email_token=AAAF6J2UHXMXHLA3ZPUN3ETQFP2ABA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4WMJNY#issuecomment-523027639>, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J7SZDV3NPYF7FDYGP3QFP2ABANCNFSM4HWZEIJA .
Hmm, right now we are calculating ids like this
pid = require('process').pid + "" + Math.round(Math.random() * 1000);
Do you think we might run out? I mean it would take thousand exports on the same Pid for it run out. Or maybe I can check and if there's a clash we can regenerate!
I think that should be good, although perhaps we can put some unique id from the cluster instance + timestamp or something? We can revisit later but in theory more than one export could conceivably be run within a second, so timestamp + 1000x random should be good but we should be able to think of something more truly unique!
On Tue, Aug 20, 2019 at 11:00 AM Varun Gupta notifications@github.com wrote:
Looking great -- i think we need to increment the pid though as it's getting overwritten mapybe? Could we use a more unique id? β¦ <#m1643449967606604490> On Tue, Aug 20, 2019 at 9:58 AM Varun Gupta @.***> wrote: Awesome! This works like a charm!! Thanks a lot for your help @icarito https://github.com/icarito https://github.com/icarito β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <#19 https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J2UHXMXHLA3ZPUN3ETQFP2ABA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4WMJNY#issuecomment-523027639>, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J7SZDV3NPYF7FDYGP3QFP2ABANCNFSM4HWZEIJA .
Hmm, right now we are calculating ids like this
pid = require('process').pid + "" + Math.round(Math.random() * 1000);
Do you think we might run out? I mean it would take thousand exports on the same Pid for it run out. Or maybe I can check and if there's a clash we can regenerate!
β You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/publiclab/image-sequencer-app/issues/19?email_source=notifications&email_token=AAAF6J7WROUVLPWYED3MGTLQFQBHJA5CNFSM4HWZEIJKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOD4WS74Y#issuecomment-523055091, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAF6J56A2EBASJ4KKJYBH3QFQBHJANCNFSM4HWZEIJA .
Now, building on the list from https://github.com/publiclab/mapknitter-exporter-sinatra/issues/23, let's see how big images/maps it can handle!
http://34.74.118.242/api/v2/export/?url=http://mapknitter.org/maps/pvdtest/warpables.json&scale=30 is real fast. seconds.
http://34.74.118.242/api/v2/export/?url=http://mapknitter.org/maps/ceres--2/warpables.json&scale=30 -- It worked!!!! 25mb output in <2 minutes! Too big even to upload to GitHub! https://www.dropbox.com/s/m6v131cyynlwx0a/ceres--2.png?dl=0
(note that scale isn't working now -- #14 )
http://34.74.118.242/api/v2/export/?url=http://mapknitter.org/maps/irish-uk-border-mapping/warpables.json&scale=30 next:
Then: