conveyal / analysis-backend

Server component of Conveyal Analysis
http://conveyal.com/analysis
MIT License
23 stars 12 forks source link

Error when uploading Opportunity dataset #247

Closed arunaxp closed 4 years ago

arunaxp commented 4 years ago

Hi @evansiroky @ansoncfit @abyrd @trevorgerhardt

Could you please someone help me to figure out the error thrown when I try to upload opportunity dataset. I'm trying to test this on my local pc

Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 8377E5DE5CBBFF30; S3 Extended Request ID: kFwEbrXg9cu03d0bgX8xHQDJxUAvIS02UlmTUUGanmh25npM0NtgbdyZrsJw8yf+03Y4D7LoY9M=) com.amazonaws.services.s3.model.AmazonS3Exception: Access Denied (Service: Amazon S3; Status Code: 403; Error Code: AccessDenied; Request ID: 8377E5DE5CBBFF30; S3 Extended Request ID: kFwEbrXg9cu03d0bgX8xHQDJxUAvIS02UlmTUUGanmh25npM0NtgbdyZrsJw8yf+03Y4D7LoY9M=), S3 Extended Request ID: kFwEbrXg9cu03d0bgX8xHQDJxUAvIS02UlmTUUGanmh25npM0NtgbdyZrsJw8yf+03Y4D7LoY9M= at com.amazonaws.http.AmazonHttpClient$RequestExecutor.handleErrorResponse(AmazonHttpClient.java:1632)

arunaxp commented 4 years ago

even though I have try to setup on my pc (offline=true),it seems that it try to access S3 bucket, In the Read-Me file describe grid_bucket: S3 bucket for storing opportunity dataset grids but I couldn't file anywhere in the property file.

This is the Conveyal properties file

This file contains the configuration options for Conveyal Analysis

Auth0 authentication configuration. The values will be ignored when configuration option offline=true.

auth0-client-id=X auth0-secret=Y

Access group for administrators

admin-access-group=OFFLINE

The host and port of the remote Mongo server (if any). Comment out for local Mongo instance.

database-uri=mongodb://127.0.0.1:27017

The name of the database in the Mongo instance.

database-name=analysis-backend-test1

The URL where the frontend is hosted.

In production this should point to a cached CDN for speed. e.g. https://d1uqjuy3laovxb.cloudfront.net

In staging this should be the underlying S3 URL so files are not cached and you see the most recent deployment.

frontend-url=http://localhost:3000

S3 buckets where Analysis inputs and results are stored.

bundle-bucket=analysis-staging-bundles grid-bucket=analysis-staging-grids results-bucket=analysis-staging-results

The URL from which Analysis can fetch OSM extracts on demand using Conveyal Vanilla Extract.

vex-url=http://osm.conveyal.com/vex

The S3 bucket where we can find tiles of the entire US census, built with Conveyal seamless-census.

seamless-census-bucket=lodes-data-2014 seamless-census-region=us-east-1

When offline is true, authentication and other services are not used.

This is only partially true - regional results will still be saved on S3.

offline=true

The AWS region in which the server is running, and in which you want to run worker machines.

aws-region=eu-west-1

The port on which the server will listen for connections from clients and workers.

server-port=7070

A temporary location to store scratch files. The path can be absolute or relative.

This allows you to locate temporary storage on an extra drive in case your main drive does not have enough space.

local-cache=/home/ec2-user/cache

local-cache=cache

This is the private IP address of the EC2 instance where the broker is running.

Instances have a seprate public and private network interface. We want the broker bound only to

the private one so that it is not accessible on the public Internet.

The broker is running on the same instance as the main backend, which is visible on the internet.

Internet since it also runs the frontend server)

server-address=0.0.0.0

Java threads for lighter async operations

light-threads=3

Java threads for heavy operations

heavy-threads=3

Max number of instances to start.

If there are more than this number running more instances will not be started.

This limit doesn't work very well because if you've manually started 200 workers on one graph,

the broker then won't start more workers for a completely different job.

max-workers=8

IAM role to assign the worker instances. Currently this is the same role assigned to the backend/broker.

This is the IAM role whose policy is defined in iam.yml (and is recursively referenced therein).

worker-iam-role=arn:aws:iam::abcdef123456

Type of EC2 instance to start up, in the standard AWS format containing a dot as listed on

https://aws.amazon.com/ec2/instance-types/

worker-type=r5.xlarge

The AWS Cloudwatch log group for worker EC2 instances.

Each worker creates its own log stream within this log group at startup.

worker-log-group=123456abcdef

The port on which the workers will listen for high priority pushed tasks.

This should be different from the server-port if you want to run a worker on the same machine as the server.

worker-port=7080

The ID of the AWS VPC subnet where EC2 worker instances will run.

Currently this is the same subnet where the backend and broker run.

The CIDR mask of this subnet naturally limits the number of workers, providing

some level of protection against high EC2 bills should workers accidentally proliferate.

worker-subnet-id=subnet-123456abcdef

The ID of the EC2 machine image for the workers. We use a stock Amazon Linux instance and perform additional

installation and configuration on startup using a user-data script.

worker-ami-id=ami-e5083683

javandres commented 4 years ago

Hi @arunaxp I am implementing this tool too, Apparently to load the set of opportunities and analysis results you need a s3 bucket, I have tested and it works correctly, you need to have the environment variables with you aws user credentials:

AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_DEFAULT_REGION

and the variables in analysis.properties with the name of the bucket: bundle-bucket = grid-bucket = results-bucket =

ansoncfit commented 4 years ago

Configuration should be simplified with v5.9.0. Closing this for now.