Closed: seanankenbruck closed this issue 5 years ago
@seanankenbruck I'm working on the problem right now. I will keep you posted on the progress.
@seanankenbruck,
I would like you to add these lines to /opt/sas/viya/config/consul/proxy.conf.ctmpl and restart the HTTPD service:
Timeout 2400
ProxyTimeout 2400
ProxyBadHeader Ignore
Are you able to reproduce the problem after that? If so, please try connecting to port 8777 instead of 443 and let me know whether the problem still occurs.
Hi: Sorry, I have been tied up with other stuff and missed this question. Are you running the requests from nodejs or from a browser? @seanankenbruck - were you able to run it successfully with your changes?
@seanankenbruck - one more question - are you running the jobs in one cas session or multiple cas sessions?
@seanankenbruck I am attaching nodejs code that uploads files in parallel using the store.submit method. Please give it a try. I tested up to 50 files: I uploaded cars.csv 50 times, but each submit created a different table. You will have to modify the code to load different files (you can create an array of file names and use that to make minimal changes to the code). Let me know if you have issues. For some reason I am not getting notifications when issues are posted. I will check with my admin about that. Cheers, Deva. PS: again, sorry for the slow response. filelimit.zip
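A minimal sketch of that change, assuming the attached sample already provides a working upload routine built around store.submit. The file names and the uploadOne placeholder below are hypothetical and would need to be filled in from filelimit.zip:

```js
// Sketch only: the upload routine itself should come from the attached
// filelimit.zip sample; this just shows the "array of file names" change
// suggested above. uploadOne() is a hypothetical placeholder.
const fileNames = ['cars.csv', 'class.csv', 'orders.csv']; // hypothetical list

async function uploadOne(store, session, fileName) {
  // Replace this body with the store.submit call from the sample,
  // passing fileName and a distinct target table name per file.
}

async function uploadAll(store, session) {
  // Start every upload concurrently and report how each one finished.
  const outcomes = await Promise.allSettled(
    fileNames.map((name) => uploadOne(store, session, name))
  );
  outcomes.forEach((o, i) => console.log(`${fileNames[i]}: ${o.status}`));
}
```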
I've added the requested lines to proxy.conf.ctmpl and have kicked off the test. Will update this post again on Monday.
@devaKumaraswamy I've written a bash script that iterates through the rows of a csv file where each row contains the name of a table loaded into cas. For each row, the script calls a node.js function that drops and reloads the table in cas. Since each call is managed by an independent process, it should spawn multiple cas sessions, correct?
The program works great for small tables, but large tables of 50 million+ rows eventually time out and receive the 500-level errors.
@seanankenbruck - yes, each nodejs run will result in a new session. If you are doing it that way, make sure you also do a store.apiCall( session.links('delete') ) to delete the session - it keeps the load on the server to a minimum. The only difference between what you are doing and what my sample code did is that my code runs inside a single nodejs function - but that will also run into the size issue.
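A hedged sketch of that cleanup tip, assuming store and session are the restAF objects each nodejs run already creates; the only restAF-specific call used is the store.apiCall(session.links('delete')) named above:

```js
// Hedged sketch: runOneTableJob is a hypothetical wrapper around the existing
// drop-and-reload work; only the session-delete call comes from the thread.
async function runOneTableJob(store, session, tableName) {
  try {
    // ... drop and reload `tableName` here, as the existing script does ...
  } finally {
    // Delete the CAS session when the job ends so sessions do not pile up.
    await store.apiCall(session.links('delete'));
  }
}
```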
As far as I can tell there is a limit to the size of the file (in bytes) that can be loaded, and this setting is set on the server. I am not sure where one sets it (probably the same place that @alexal discussed). I will ask our internal experts about it. How big is your csv in bytes?
Sounds like you need an alternate way to load very large files. You might have to use something like proc casutil.
I will ask the experts here, but meanwhile you can also post to the tech support site - they might have an answer already.
One last ugly option is to break up the csv file and run a data step on cas to append the data - this is an option if for some reason you cannot run proc casutil.
I am still not getting notifications when someone posts here - I will have to ping my admin again. Sigh!
@devaKumaraswamy We are actually loading/unloading a Postgres table through an ODBC connection in Viya.
@seanankenbruck To be clear, the 50 million row use case is from Postgres.
@seanankenbruck if you are using table.upload with the postgres data source option and it is failing, can you post it to SAS support? They might have some tips on how to handle your case. Let me know if I am misinterpreting your problem.
Sure, I'll continue this conversation with tech support. It doesn't appear to be a bug with the library but more of a server issue. Thank you.
There appears to be a hard limit on the number of concurrent requests that can be made against the CAS server via the restAF library, and it seems like that limit is 30.
- When you try to load 30+ tables at once, a 500 or 504 server error is returned every time and the tables do not get loaded.
- When you load 29 or fewer tables at once, the code executes successfully and I don't receive a 500 error, not even one time.
My question is whether there is a way to increase this request limit in the restAF configuration settings, or whether the limit can be increased on the CAS server itself.
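Not an answer to the configuration question, but a hedged client-side sketch of one way to stay under the observed ceiling while that answer is pending; loadTable is a hypothetical wrapper around whatever call performs a single table load:

```js
// Hedged workaround sketch: keep in-flight loads below the observed ~30
// concurrent-request ceiling by running them in fixed-size batches.
async function loadInBatches(tableNames, loadTable, batchSize = 25) {
  const results = [];
  for (let i = 0; i < tableNames.length; i += batchSize) {
    const batch = tableNames.slice(i, i + batchSize);
    // Wait for the whole batch to settle before starting the next one,
    // so concurrency never exceeds batchSize.
    const settled = await Promise.allSettled(batch.map((name) => loadTable(name)));
    results.push(...settled);
  }
  return results;
}
```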