Closed: LcodingL closed this issue 4 years ago.
Modify the settings.py of LogParser instead.
SCRAPYD_SERVER = '127.0.0.1:6800'
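For example, a minimal sketch of the change in LogParser's settings.py, with 10.3.64.153 as a stand-in for your actual host IP:

# settings.py of LogParser: use the real host IP instead of the loopback address
SCRAPYD_SERVER = '10.3.64.153:6800'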
@LcodingL Has your problem been solved?
Hi, thanks for your help! I will try it tomorrow and give you feedback ASAP.
Hi, sorry for the delay. For some reason I haven't been able to try your instructions yet. I'll give you feedback once I get the opportunity.
Besides, I want to know how to store the actual host_ip in the column url_scrapydweb of the table metadata in the database scrapydweb_metadata. It is always http://127.0.0.1:5000 no matter how many times I alter it to the actual host_ip manually.
Never modify the database scrapydweb_metadata manually. The value of url_scrapydweb in that database depends on SCRAPYDWEB_BIND.
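For reference, a minimal sketch of the relevant options in the scrapydweb settings file (e.g. scrapydweb_settings_v10.py); the IP below is a stand-in for your actual host_ip:

# scrapydweb settings: url_scrapydweb is derived from these values
SCRAPYDWEB_BIND = '10.3.64.153'  # the actual host IP, not 127.0.0.1
SCRAPYDWEB_PORT = 5000           # default port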
Hi~
I've modified SCRAPYD_SERVER to the actual host_ip in the settings.py of LogParser and restarted scrapydweb (I set ENABLE_LOGPARSER=True in the scrapydweb configuration). But the json_url I got from the API http://host_ip:6800/logs/stats.json still started with http://127.0.0.1:6800/.
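For reference, here is a minimal Python sketch (assuming the requests package is installed and Scrapyd runs without auth) of how I check the prefix of every job's json_url from the stats API:

import requests

# Fetch the stats API that LogParser writes and Scrapyd serves
# (10.3.64.153 stands in for the actual host_ip).
data = requests.get('http://10.3.64.153:6800/logs/stats.json').json()

# 'datas' maps project -> spider -> job -> parsed stats.
for project, spiders in data['datas'].items():
    for spider, jobs in spiders.items():
        for job, stats in jobs.items():
            print(job, stats['json_url'])  # still shows the 127.0.0.1 prefix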
Can you test as follows:
Hi~
I've followed those steps, and the json_url in the API http://host_ip:6800/logs/stats.json still starts with http://127.0.0.1:6800/.
Below are the logs of logparser:
[2019-11-26 12:56:03,029] INFO in logparser.run: LogParser version: 0.8.2
[2019-11-26 12:56:03,030] INFO in logparser.run: Use 'logparser -h' to get help
[2019-11-26 12:56:03,030] INFO in logparser.run: Main pid: 14440
[2019-11-26 12:56:03,030] INFO in logparser.run: Check out the config file below for more advanced settings.
Loading settings from /root/Envs/spider_py3.6/lib/python3.6/site-packages/logparser/settings.py
[2019-11-26 12:56:03,033] DEBUG in logparser.run: Reading settings from command line: Namespace(delete_json_files=False, disable_telnet=False, main_pid=0, scrapyd_logs_dir='/root/logs', scrapyd_server='10.3.64.153:6800', sleep=10, verbose=False)
[2019-11-26 12:56:03,033] DEBUG in logparser.run: Checking config
[2019-11-26 12:56:03,033] INFO in logparser.run: SCRAPYD_SERVER: 10.3.64.153:6800
[2019-11-26 12:56:03,033] INFO in logparser.run: SCRAPYD_LOGS_DIR: /root/logs
[2019-11-26 12:56:03,033] INFO in logparser.run: PARSE_ROUND_INTERVAL: 10
[2019-11-26 12:56:03,034] INFO in logparser.run: ENABLE_TELNET: True
[2019-11-26 12:56:03,034] INFO in logparser.run: DELETE_EXISTING_JSON_FILES_AT_STARTUP: False
[2019-11-26 12:56:03,034] INFO in logparser.run: VERBOSE: False
Visit stats at: http://10.3.64.153:6800/logs/stats.json
Can you run 'logparser --delete_json_files' and post the content of http://10.3.64.153:6800/logs/stats.json?
{
  "status": "ok",
  "datas": {
    "2019Phase1": {
      "gxb_1": {
        "task_1_2019-11-22T20_00_00": {
          "log_path": "/root/logs/2019Phase1/gxb_1/task_1_2019-11-22T20_00_00.log",
          "json_path": "/root/logs/2019Phase1/gxb_1/task_1_2019-11-22T20_00_00.json",
          "json_url": "http://127.0.0.1:6800/logs/2019Phase1/gxb_1/task_1_2019-11-22T20_00_00.json",
          "size": 4703,
          "position": 4703,
          "status": "ok",
          "pages": 12,
          "items": 2,
          "first_log_time": "2019-11-22 20:00:11",
          "latest_log_time": "2019-11-22 20:00:13",
          "runtime": "0:00:02",
          "shutdown_reason": "N/A",
          "finish_reason": "finished",
          "last_update_time": "2019-11-22 20:00:18"
        }
      }
    }
  },
  "settings_py": "/root/Envs/spider_py3.6/lib/python3.6/site-packages/logparser/settings.py",
  "settings": {
    "scrapyd_server": "10.3.64.153:6800",
    "scrapyd_logs_dir": "/root/logs",
    "parse_round_interval": 10,
    "enable_telnet": true,
    "override_telnet_console_host": "",
    "log_encoding": "utf-8",
    "log_extensions": [".log", ".txt"],
    "log_head_lines": 100,
    "log_tail_lines": 200,
    "log_categories_limit": 10,
    "jobs_to_keep": 100,
    "chunk_size": 10000000,
    "delete_existing_json_files_at_startup": false,
    "keep_data_in_memory": false,
    "verbose": false,
    "main_pid": 0
  },
  "last_update_timestamp": 1574746025,
  "last_update_time": "2019-11-26 13:27:05",
  "logparser_version": "0.8.2"
}
Your settings output already shows scrapyd_server: 10.3.64.153:6800; only the .json files written before the change still carry the old http://127.0.0.1:6800/ prefix. Please delete stats.json and run 'logparser --delete_json_files' for the first run.
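A sketch of that sequence, assuming SCRAPYD_LOGS_DIR is /root/logs as in your logs (stop any running logparser first):

rm /root/logs/stats.json
logparser --delete_json_files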
Great! It does work! Thanks a lot for your help and patience! Have a nice day~
Describe the need
I want to call the API http://host_ip:6800/logs/stats.json to get all the jobs' json_url and request them externally. But all the json_urls I got started with http://127.0.0.1:6800/, which is not the actual host_ip. I set as below at the beginning.
But things didn't change after I changed the configuration as below and restarted scrapydweb.
And the column url_scrapydweb stored in the table metadata of the database scrapydweb_metadata is always http://127.0.0.1:5000, no matter how many times I alter it to the actual host_ip manually.

Screenshots
Since I could not upload the screenshots successfully, I just paste the returned JSON data below:
Environment (please complete the following information):