Closed Mbrownshoes closed 6 years ago
@ll911 This error seems to be caused when the resource is coming from a proxy. pdf resource https://catalogue.data.gov.bc.ca/dataset/caribou-habitat-model-for-the-western-cariboo-region-2001/resource/b8809199-96e9-4415-b8b0-899eb8411118/proxy
This is defined by ckan.resource_proxy.max_file_size; the default is 1MB. Changing this parameter will impact performance, so we need some justification if the value is to be changed.
@dkelsey If this is an enhancement, what am I enhancing it to? Is this now a duplicate of #372?
@garrettH3S It's related yes. The behavior I would expect would be if the PDF is larger than 1048576 bytes no preview is displayed.
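The expected behaviour described above can be sketched as a simple size gate. This is only an illustration, not CKAN's actual resource proxy code: should_preview and DEFAULT_MAX_FILE_SIZE are hypothetical names, and the 1048576-byte value is the 1MB default quoted in this thread.

```python
# Hypothetical sketch of the expected behaviour, not CKAN's implementation.
DEFAULT_MAX_FILE_SIZE = 1024 * 1024  # 1048576 bytes, the 1MB default quoted above


def should_preview(content_length, max_file_size=DEFAULT_MAX_FILE_SIZE):
    """Return True only when the proxied file fits within the size limit.

    A missing size (None) is treated as not previewable; that choice is an
    assumption made for this example.
    """
    if content_length is None:
        return False
    return content_length <= max_file_size
```

Under this sketch a PDF of exactly 1048576 bytes would still be previewed, and anything larger would show no preview instead of an error.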
I created a dataset to verify this issue:
@ll911: ckan.resource_proxy.max_file_size is not set in CAD, so the 1MB default applies.
@ll911 should ckan.resource_proxy.max_file_size be configured in CAD?
I looked at the current behavior in PROD, and forcing a PDF to upload to the DataStore gets the same "pdftables is not installed" error as listed below. I take this to mean we do not store PDFs in the DataStore. PDFs are saved only to the FileStore.
1.8MB
the other is 18.5MB
Error:
  File "/apps/ckan/tst/datapusher/lib/python2.7/site-packages/apscheduler/scheduler.py", line 512, in _run_job
    retval = job.func(*job.args, **job.kwargs)
  File "/apps/cis/workspace/bcdc/bcdc-rc/src/datapusher/datapusher/jobs.py", line 404, in push_to_datastore
    table_set = messytables.any_tableset(tmp, mimetype=ct, extension=ct)
  File "/apps/ckan/tst/datapusher/lib/python2.7/site-packages/messytables/any.py", line 137, in any_tableset
    return parsers[attempt](fileobj, **kw)
  File "/apps/ckan/tst/datapusher/lib/python2.7/site-packages/messytables/pdf.py", line 50, in __init__
    raise ImportError("pdftables is not installed")
ImportError('pdftables is not installed',)
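One way to confirm what the traceback reports is to check, from the same virtualenv that DataPusher runs in, whether the optional module can be imported at all. The helper below is a minimal sketch; has_module is a hypothetical name, and the check only tells you whether the import succeeds, not whether PDFs should be pushed to the DataStore.

```python
def has_module(name):
    """Return True if the named module can be imported in this environment."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Run this with the DataPusher virtualenv's interpreter.
    print("pdftables available: {0}".format(has_module("pdftables")))
```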
CAD
there is another error--'Could not connect to DataPusher'
Error: Resource too large to process: 10492139 > max (10485760).
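For reference, the limit in this message is exactly 10 MB (10 * 1024 * 1024 bytes), and the file is only a few kilobytes over it. The snippet below reproduces the comparison as a sketch; check_size and MAX_CONTENT_LENGTH are illustrative names, not taken from the DataPusher source.

```python
# Illustrative only: reproduces the arithmetic behind the reported error.
MAX_CONTENT_LENGTH = 10485760  # limit shown in the error, 10 * 1024 * 1024 bytes


def check_size(size, limit=MAX_CONTENT_LENGTH):
    """Return an error string in the reported format if size exceeds limit, else None."""
    if size > limit:
        return "Resource too large to process: {0} > max ({1}).".format(size, limit)
    return None


if __name__ == "__main__":
    print(check_size(10492139))
    print(10492139 - MAX_CONTENT_LENGTH)  # bytes over the limit: 6379
```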
CAD
there is another error.
@dkelsey this issue is known not to be working. Dave, do you want us to continue working on it, or do you want to do more testing first?
@jeff-at-h3 I realized a while ago that I had the sizes wrong. The max size is 150 MB.
I'm testing this today.
This is a known issue, not a new one. Let's assess priorities. Hold off working on this for now.
figured out the test case
CAD, CAT, PROD
'Upload'
use 'Link'
This has been more clearly described in #470
See TESTCASE below further in this issue.
We are unable to preview this draft record's PDF resource (it results in a server error): https://catalogue.data.gov.bc.ca/dataset/test-geo-1-2/resource/bf5e743a-4320-40dc-8a68-7bdbb11bf064 However, the download does work. This resource is uploaded to the catalogue, though linked PDFs have experienced the same problem. We also experienced this problem with https://catalogue.data.gov.bc.ca/dataset/caribou-habitat-model-for-the-eastern-cariboo-region-columbia-highlands-northern-columbia-mountains, so I moved the PDF to the description.
Other pdf previews do work, such as this (which is a resource in the same record as above) https://catalogue.data.gov.bc.ca/dataset/test-geo-1-2/resource/1e928ab5-cdd9-4769-8be7-c84ef6944e5c
This behaviour is inconsistent, and I was able to see PDF previews in CAT and CAD for both newly created and old resources.
@ll911 suggests solr might need to be upgraded to fix. @dkelsey