Closed jagdevsingh9709 closed 10 months ago
Do you actually need to know if the PDF has content or do you just want to know if the source URL has content?
You could check the URL first:
srcResp, err := http.Get("https://example.com")
if err != nil {
	logFunc.Error(err)
	pending.abort(job.ID)
	return nil, err
}
defer srcResp.Body.Close()

// you could check the content length (if available; it is -1 when unknown)
if srcResp.ContentLength == 0 {
	err := errors.New("no content")
	logFunc.Error(err)
	pending.abort(job.ID)
	return nil, err
}

srcBody, err := io.ReadAll(srcResp.Body)
if err != nil {
	logFunc.Error(err)
	pending.abort(job.ID)
	return nil, err
}

// or the body
if len(srcBody) == 0 {
	err := errors.New("no body")
	logFunc.Error(err)
	pending.abort(job.ID)
	return nil, err
}
log.Println(string(srcBody)) // etc.

pdfg, err := NewPDFGenerator()
if err != nil {
	log.Fatal(err)
}

// pass the body as reader
pdfg.AddPage(NewPageReader(bytes.NewReader(srcBody)))
err = pdfg.Create()
if err != nil {
	log.Fatal(err)
}
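Note that a dynamic page can return a non-empty response that still renders as blank, for example an HTML shell with an empty body, so `len(srcBody) == 0` may not catch it. Here is a rough sketch of a stricter check, using only the standard library. The name `hasVisibleText` is made up for illustration, and stripping tags with regular expressions is a crude heuristic, not a real HTML parser. It also treats a page that contains only images as blank, so adjust to your content.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Elements whose contents never show up in the page body.
	hiddenRe = regexp.MustCompile(`(?is)(?:<head\b.*?</head\s*>|<script\b.*?</script\s*>|<style\b.*?</style\s*>)`)
	// HTML comments.
	commentRe = regexp.MustCompile(`(?s)<!--.*?-->`)
	// Any remaining tag, including the doctype.
	tagRe = regexp.MustCompile(`(?s)<[^>]*>`)
)

// hasVisibleText (hypothetical helper) reports whether the HTML has any
// non-whitespace text left once head, script, style, comments and tags
// are stripped.
func hasVisibleText(html []byte) bool {
	s := hiddenRe.ReplaceAllString(string(html), " ")
	s = commentRe.ReplaceAllString(s, " ")
	s = tagRe.ReplaceAllString(s, " ")
	s = strings.ReplaceAll(s, "&nbsp;", " ")
	return strings.TrimSpace(s) != ""
}

func main() {
	empty := []byte("<html><head><title>Report</title></head><body>  </body></html>")
	full := []byte("<html><body><p>Hello</p></body></html>")
	fmt.Println(hasVisibleText(empty), hasVisibleText(full))
}
```

You would call this on `srcBody` before `pdfg.AddPage(...)` and abort the job the same way as in the other checks when it returns false.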
Or, if you actually want to check the PDF itself, that is not supported here, but you could use a PDF parser like https://github.com/dslipak/pdf
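If you go the parser route, keep in mind that your "blank" PDF still contains the header and footer text, so the extracted text will not be empty. One possible approach, sketched below, is to remove the known header and footer strings from the extracted text and see whether anything is left. The extraction step is not shown: `extracted` is assumed to be the plain text you got from whichever parser you pick. The name `bodyIsBlank` and the sample header and footer strings are invented for this example.

```go
package main

import (
	"fmt"
	"strings"
)

// normalize collapses every run of whitespace into a single space, since
// text extracted from a PDF often has unpredictable line breaks.
func normalize(s string) string {
	return strings.Join(strings.Fields(s), " ")
}

// bodyIsBlank (hypothetical helper) reports whether the extracted text
// contains nothing besides the given header/footer strings.
func bodyIsBlank(extracted string, chrome ...string) bool {
	rest := normalize(extracted)
	for _, c := range chrome {
		c = normalize(c)
		if c == "" {
			continue
		}
		rest = strings.ReplaceAll(rest, c, "")
	}
	return strings.TrimSpace(rest) == ""
}

func main() {
	// Assumed header and footer text; use whatever you set on the page.
	header, footer := "ACME Report", "Page 1 of 1"

	blank := "ACME Report\n\n  Page 1 of 1\n"
	filled := "ACME Report\nTotal: 42\nPage 1 of 1"
	fmt.Println(bodyIsBlank(blank, header, footer), bodyIsBlank(filled, header, footer))
}
```

When `bodyIsBlank` returns true you would publish to your retry topic instead of delivering the PDF. Exact string matching is fragile if the footer contains things like a timestamp, so you may need a looser comparison.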
Is there any way to tell whether the go-wkhtmltopdf library has generated a blank PDF? We create PDFs from a dynamic HTML page URL, and the page may or may not have content. When it has content, the PDF is generated correctly. When the HTML content is empty, a blank PDF is generated (containing only the headers and footers we set on the page instance). We need to detect when the generated PDF has a blank body so that we can create a Kafka message and send it to a retry topic.