How to configure scrapydweb to run on Ubuntu EC2 and display at ec2 ip address

cyclehacker commented 4 years ago

Thanks for the great project!

I've set up and run scrapydweb locally and it works great. I'm now trying to set it up on an EC2 instance.

What steps would I need to take to deploy the app to the IP of the EC2 instance so that I can access it from my browser?

Many thanks,

my8100 commented 4 years ago

Maybe you can refer to #73

cyclehacker commented 4 years ago

Thanks, I managed to figure it out!

my8100 commented 4 years ago

Could you share your experience?

cyclehacker commented 4 years ago

It turns out I didn't manage to get it working.

I'm trying to run it on gunicorn like this: gunicorn -b 0.0.0.0:5000 scrapydweb.run:main

scrapyd is running on 127.0.0.1:6800, and scrapydweb is set to run on 0.0.0.0:5000.

I've tried multiple combinations of addresses, but I get either "site can't be reached" or internal server errors.

I'm clearly missing something fundamental here.

my8100 commented 4 years ago

I am not sure whether scrapydweb could work well with gunicorn. Maybe you should test it locally first, or use nohup directly. You can use svr-6.herokuapp.com:80 as a scrapyd server for debugging.
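
For the "test it locally first" step, a small port check can help separate "the service is not listening" from "the service is listening but not reachable from outside". The sketch below is only an illustration: `can_connect` is a hypothetical helper, not part of scrapydweb, and the ports are the ones mentioned in this thread (6800 for scrapyd, 5000 for scrapydweb).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Assumed ports, taken from this thread: scrapyd on 6800, scrapydweb on 5000.
    for name, port in (("scrapyd", 6800), ("scrapydweb", 5000)):
        print(name, "reachable on 127.0.0.1:", can_connect("127.0.0.1", port))
```

Run it on the instance first, then from your own machine with the instance's public IP in place of 127.0.0.1. If the port answers on the instance but not from outside, a likely cause is that the EC2 security group does not allow inbound traffic on that port.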

cyclehacker commented 4 years ago

nohup is for running a Unix job in the background, which I see might be useful once I get scrapydweb running in a production environment:

"WARNING: Do not use the development server in a production. Check out http://flask.pocoo.org/docs/1.0/deploying/"

Have you tested with any of the options on the above page? Or do you think it's not really going to work well in any of these environments?

caffeinatedMike commented 3 years ago

@cyclehacker the reason you're having issues running this app with gunicorn is because the function that you're calling is running the app instance instead of returning it for gunicorn to handle. You can tweak this by returning the app variable from the function and that should allow the app to work properly with gunicorn and any other server libraries.
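
A minimal sketch of the difference, using only the standard library. The names here are hypothetical stand-ins (this `create_app` and `main` are not scrapydweb's real functions), and `call_wsgi` just imitates what a WSGI server does for one request.

```python
from wsgiref.util import setup_testing_defaults


def create_app():
    """Hypothetical stand-in for an app factory; returns a WSGI callable."""
    def app(environ, start_response):
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [b"hello from the app\n"]
    return app


def main():
    """Entry point that builds and returns the app instead of serving it."""
    app = create_app()
    # A command-line entry point would start its own server here and block.
    return app


def call_wsgi(app):
    """Invoke a WSGI callable once, roughly as a server would per request."""
    environ = {}
    setup_testing_defaults(environ)
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status

    body = b"".join(app(environ, start_response))
    return captured["status"], body


if __name__ == "__main__":
    # Works: the server receives the app that main() returns.
    print(call_wsgi(main()))
    # Fails: main itself is not a WSGI app, it does not accept
    # (environ, start_response).
    try:
        call_wsgi(main)
    except TypeError as err:
        print("TypeError:", err)
```

The point is that the server needs the object `main()` returns, not `main` itself: handing it the zero-argument function raises a `TypeError` on the first request, which would surface as an internal server error.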

Tobeyforce commented 3 years ago

> @cyclehacker the reason you're having issues running this app with gunicorn is because the function that you're calling is running the app instance instead of returning it for gunicorn to handle. You can tweak this by returning the app variable from the function and that should allow the app to work properly with gunicorn and any other server libraries.

@caffeinatedMike Have you managed to deploy scrapydweb? I'm trying to return app, but figuring out how Flask works is giving me a headache. @cyclehacker Did you succeed in switching over to a prod server? Would you mind sharing what you did?

Just returning app at the end did not work for me when running gunicorn -b 0.0.0.0:5000 scrapydweb.run:main

def main():
    apscheduler_logger.setLevel(logging.ERROR)  # To hide warning logging in scheduler.py until app.run()
    main_pid = os.getpid()
    logger.info("ScrapydWeb version: %s", __version__)
    logger.info("Use 'scrapydweb -h' to get help")
    logger.info("Main pid: %s", main_pid)
    logger.debug("Loading default settings from %s", handle_slash(DEFAULT_SETTINGS_PY_PATH))
    app = create_app()
    handle_metadata('main_pid', main_pid)  # In handle_metadata(): with db.app.app_context():
    app.config['MAIN_PID'] = main_pid
    app.config['DEFAULT_SETTINGS_PY_PATH'] = DEFAULT_SETTINGS_PY_PATH
    app.config['SCRAPYDWEB_SETTINGS_PY_PATH'] = os.path.join(os.getcwd(), SCRAPYDWEB_SETTINGS_PY)
    load_custom_settings(app.config)

    args = parse_args(app.config)
    # "scrapydweb -h" ends up here
    update_app_config(app.config, args)
    try:
        check_app_config(app.config)
    except AssertionError as err:
        logger.error("Check app config fail: ")
        sys.exit(u"\n{err}\n\nCheck and update your settings in {path}\n".format(
                 err=err, path=handle_slash(app.config['SCRAPYDWEB_SETTINGS_PY_PATH'])))
    return app  # Changed: return the app instead of calling app.run()

guishake commented 3 years ago

I think you've got to run gunicorn --bind 0.0.0.0:5000 "scrapydweb.run:main()"

my8100 / scrapydweb, issue #131