management Web UI on localhost:4001 is not working #278

cmnstmntmn commented 8 months ago

Hey, great work first of all!

My spider works fine, but the interface is not running. I tried to wire it to Bandit. Am I missing something?

The API is working


Also /new is working


But the management interface (the index/list page, this one) is not working.

Relevant code:

# lib/application.ex

defmodule MyApp.Application do
  # See https://hexdocs.pm/elixir/Application.html
  # for more information on OTP Applications
  @moduledoc false

  use Application

  @impl true
  def start(_type, _args) do
    children = [
      # Serve the Crawly management API/UI router through Bandit
      {Bandit, plug: Crawly.API.Router}
    ]

    # See https://hexdocs.pm/elixir/Supervisor.html
    # for other strategies and supported options
    opts = [strategy: :one_for_one, name: Netskope.Supervisor]
    Supervisor.start_link(children, opts)
  end
end

# config/config.exs

import Config

config :crawly,
  closespider_timeout: 10,
  concurrent_requests_per_domain: 8,
  closespider_itemcount: 100,
  log_dir: "./tmp/spider_logs",
  log_to_file: true,
  start_http_api: true,

  middlewares: [
    {Crawly.Middlewares.UserAgent, user_agents: ["Crawly Bot", "Google"]}
  ],
  pipelines: [
    # An item is expected to have all fields defined in the fields list
    {Crawly.Pipelines.Validate, fields: [:url]},

    # Use the following field as the item's unique identifier; the pipeline
    # drops items with the same URL
    {Crawly.Pipelines.DuplicatesFilter, item_id: :url},
    {Crawly.Pipelines.WriteToFile, folder: "./tmp", extension: "jl"}
  ]


cmnstmntmn commented 8 months ago

I think I found the issue and created a PR for it: https://github.com/elixir-crawly/crawly/pull/279

However, even though the interface now loads, the list of spiders is empty.

cmnstmntmn commented 8 months ago

Fixed via this PR

JonasGruenwald commented 4 months ago

I was also a bit confused, because the docs state that the management interface is on by default:


start_http_api? :: boolean() default: true


NOTE: It's possible to disable the Simple management UI (and rest API) with the start_http_api?: false options of Crawly configuration.

But in reality it looks to me like it's off by default:


oltarasenko commented 3 months ago

I think I have addressed the last issue, and the API is now enabled by default.
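
For anyone on a release where the API is still off by default, here is a minimal sketch of enabling it explicitly. It assumes the option is spelled `start_http_api?`, with the trailing question mark, as in the docs quoted above. The config posted at the top of this thread uses `start_http_api` without the question mark; whether that spelling is honoured is not confirmed here.

# config/config.exs

import Config

config :crawly,
  # Assumption: key name taken from the docs excerpt quoted in this thread
  start_http_api?: true

To check which value the running application actually sees, `Application.get_env(:crawly, :start_http_api?)` can be evaluated in IEx. A `nil` result means the key was never set under that exact name.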