Closed: CorentinB closed this issue 4 months ago.
I would like to work on this issue. How can I check that crawling still works after the change?
Is it `go run . get url https://some.url`?
Whatever URL I try, I get:
panic: runtime error: slice bounds out of range [:7] with length 6
Can I work on this?
Hey guys, I like the enthusiasm a lot, but I wonder, where are you all coming from? We didn't have any contributors for years, and out of the blue you are all popping up :) Where did you learn about Zeno?
@HarshNarayanJha I'm looking into that panic right now..
😆 Honestly, I just went to goodfirstissues and found this repo. Interestingly, this repo is owned by archive.org (a lifesaver sometimes), so I decided to contribute.
I had never even known Zeno existed before!
Sounds good! If you want to tackle that one, that's fine with me. Be aware that we would need unit tests for whatever function you add to replace the behavior of the library we're using.
I am assigning this issue to you.
(got distracted but I'm coming back to your panic asap..)
@HarshNarayanJha the panic is fixed: https://github.com/internetarchive/Zeno/pull/91
Wow! That was quick! So it was a minor hash/version slicing issue after all! I will update this tomorrow (9PM, gotta sleep soon)
@CorentinB Any example URLs that send the `link` header? I tried google.com, archive.org, and even example.com, but they don't have the `link` header.
```go
var (
	links      = linkheader.Parse(resp.Header.Get("link")) // <---- this right here
	discovered []string
)

for _, link := range links {
	discovered = append(discovered, link.URL)
}
```
It seems to me that it is a list of related URLs for the page. I can probably implement the parsing and just return an empty list when the header is absent, but a real example URL would still be helpful.
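For reference, here is roughly the kind of stdlib-only replacement I have in mind — a minimal sketch, not Zeno's actual code. The `parseLinkHeader` name is just a placeholder, and it only extracts the URL targets, ignoring `rel` and the other RFC 8288 parameters:

```go
package main

import (
	"fmt"
	"strings"
)

// parseLinkHeader is a hypothetical stdlib-only stand-in for
// linkheader.Parse. It pulls the <...> URL targets out of a header
// value like `<https://a/2>; rel="next", <https://a/1>; rel="prev"`.
// It deliberately ignores parameters and assumes no commas inside URLs.
func parseLinkHeader(header string) []string {
	var urls []string
	for _, part := range strings.Split(header, ",") {
		part = strings.TrimSpace(part)
		if !strings.HasPrefix(part, "<") {
			continue // not a link-value, skip it
		}
		if end := strings.Index(part, ">"); end > 0 {
			urls = append(urls, part[1:end])
		}
	}
	return urls
}

func main() {
	h := `<https://example.com/2>; rel="next", <https://example.com/1>; rel="prev"`
	fmt.Println(parseLinkHeader(h))
	// Output: [https://example.com/2 https://example.com/1]
}
```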
I think you could start by writing a unit test first, so you would have a response with a Link header!
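Something along these lines — a rough sketch using `net/http/httptest` and the hypothetical `parseLinkHeader` helper from above (not actual Zeno code), where the test server fabricates a response carrying a Link header:

```go
package main

import (
	"net/http"
	"net/http/httptest"
	"testing"
)

// TestParseLinkHeader spins up a test server that sends a Link header,
// then checks that the parseLinkHeader helper extracts the right URLs
// from the real HTTP response.
func TestParseLinkHeader(t *testing.T) {
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Link", `<https://example.com/2>; rel="next", <https://example.com/1>; rel="prev"`)
	}))
	defer srv.Close()

	resp, err := http.Get(srv.URL)
	if err != nil {
		t.Fatal(err)
	}
	defer resp.Body.Close()

	got := parseLinkHeader(resp.Header.Get("Link"))
	want := []string{"https://example.com/2", "https://example.com/1"}
	if len(got) != len(want) {
		t.Fatalf("got %v, want %v", got, want)
	}
	for i := range want {
		if got[i] != want[i] {
			t.Errorf("link %d: got %q, want %q", i, got[i], want[i])
		}
	}
}
```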
You can check it now: #95
github.com/tomnomnom/linkheader is being used to parse Link HTTP headers; pretty sure we can replace it with standard-library-only code.
Whatever code we add, we need unit tests for it. (We don't have unit tests for that right now.)