ckan / datapusher

A standalone web service that pushes data files from a CKAN site's resources into its DataStore
GNU Affero General Public License v3.0

Dangling response on raise util.JobError #212

Open lmmaia opened 4 years ago

lmmaia commented 4 years ago

Hi guys.

I've been testing CKAN on different platforms, and I currently have two cases where CKAN just blocks after uploading a large file. I traced it back to the datapusher and eventually saw it block on:

if cl and int(cl) > MAX_CONTENT_LENGTH: raise util.JobError(..)
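
For readers following along, the size check quoted above can be sketched roughly as below. This is a minimal illustration, not the datapusher source: the `JobError` class is a hypothetical stand-in for `util.JobError`, the 10 MB limit is an assumed value, and `check_content_length` is a name made up for this example.

```python
# Assumed limit for illustration only; the real value comes from datapusher's config.
MAX_CONTENT_LENGTH = 10 * 1024 * 1024  # 10 MB


class JobError(Exception):
    """Hypothetical stand-in for datapusher's util.JobError."""


def check_content_length(headers, limit=MAX_CONTENT_LENGTH):
    """Raise JobError if the declared Content-Length exceeds the limit.

    `headers` is any mapping with a .get() method (for example, response headers).
    Returns the raw header value, or None when the header is absent.
    """
    cl = headers.get("content-length")
    if cl and int(cl) > limit:
        raise JobError(
            "Resource too large to download: {} > max ({}).".format(cl, limit)
        )
    return cl
```

In this sketch the error is an ordinary exception, so whatever calls the check has to catch it and report the job as failed; if nothing handles it, the caller may never see a result.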

So I made a fork of CKAN, added the latest datapusher, and made it build with docker-compose. It's available here: https://github.com/lmmaia/ckan/tree/2.9

It also uses PostgreSQL. It's a very lazy build, but I think it works to rule out other known errors with SQLite.

To work around the error I just replaced the raise util.JobError with an Exception(...), but I still couldn't figure out why the response dangles on only some machines.

Best regards.

mbocevski commented 4 years ago

@lmmaia what do you mean by a dangling response? Does an exception get raised at all, and if so, can you paste it here? Datapusher sets a DOWNLOAD_TIMEOUT on the requests, so the jobs should either throw an exception or time out.
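
As a rough illustration of the "exception or timeout" behaviour described above, here is a minimal sketch using only the Python standard library (`urllib`) rather than the HTTP client datapusher itself uses. The `DOWNLOAD_TIMEOUT` value, the `JobError` class, and the `fetch` helper are assumptions made for this example.

```python
import urllib.request

DOWNLOAD_TIMEOUT = 30  # seconds; assumed value for illustration


class JobError(Exception):
    """Hypothetical stand-in for datapusher's util.JobError."""


def fetch(url, timeout=DOWNLOAD_TIMEOUT):
    """Download `url`, converting timeouts and connection errors into JobError."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError as exc:
        # socket.timeout and urllib.error.URLError are both OSError subclasses,
        # so a stalled or unreachable server ends up here.
        raise JobError("Download failed or timed out: {}".format(exc)) from exc
```

Note that a socket timeout like this applies to each blocking operation (connect, each read), not to the download as a whole, so a server that keeps trickling bytes would not trigger it.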

lmmaia commented 4 years ago

Hi, thanks for the response. No, it just blocks. On both machines CKAN becomes unresponsive until datapusher is restarted.

To try to replicate, just build CKAN 2.9 (tag) with docker-compose from the repo and upload a file larger than 10 MB.

It shouldn't matter, but I'm running the containers on 3 different machines: a virtual machine with CentOS 8, another with CentOS 7, and a physical one with Ubuntu 18.04. The CentOS 7 machine is the only one working correctly. All have the latest versions of docker and docker-compose.

I'll explore this in depth later this week.

mbocevski commented 4 years ago

@lmmaia I'm assuming you have connectivity between the nodes and there are no timeout issues? I would suggest that you make sure that all the different hosts and components can talk to each other based on the configuration you have in CKAN and datapusher.
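
One quick way to run such a check is a plain TCP probe from inside each container. The sketch below is only an illustration: `can_connect` is a helper made up for this example, and it tests reachability at the TCP level only, not whether the URLs configured in CKAN and datapusher are correct.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False
```

For example, from the CKAN container one might call `can_connect("datapusher", 8800)`, and from the datapusher container `can_connect("ckan", 5000)`. The host names and ports here are assumptions (typical docker-compose service names and default ports) and should be replaced with whatever your configuration actually uses.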

luismaiaDEVSCOPE commented 3 years ago

@mbocevski Sorry for the late reply, but I'm still facing a lot of issues with the datastore in my use case. I'm abandoning the older version that is in the docker-compose of CKAN.

I saw that you had a Dockerfile to build a more recent version of Datapusher, so I tried using an image generated with that Dockerfile (https://github.com/ckan/datapusher/pull/210/commits/1c5bcc6e76074fd0f88120559465bad3368cfb52) and connecting it to CKAN 2.9.1. For some reason the datapusher receives the request but does not return anything.

Can you help?

Thanks very much for the time in any case ;)

mbocevski commented 3 years ago

@luismaiaDEVSCOPE check out https://github.com/keitaroinc/docker-ckan. We build docker images for CKAN and datapusher there. The datapusher image uses our fork, which has features that are not merged upstream, such as https://github.com/ckan/datapusher/pull/206. That one is needed for setups like docker, where datapusher can't communicate with CKAN using SITE_URL.

Check out and run our docker-compose setup from our repo and see if you have any issues.

luismaiaDEVSCOPE commented 3 years ago

@mbocevski How is this not the default install method for CKAN?! Even the plugin docker extension is spot on. Thanks, really good.

If you guys have some challenges that need more people to help evolve, just let me know.

mbocevski commented 3 years ago

@luismaiaDEVSCOPE we've been doing this for many years and a lot of folks know about our repo. Feel free to contribute and raise issues. We are always open to collaboration.