Open displague opened 1 year ago
Same issue here with Confluent Kafka Provider.
scraper: error: Failed to scrape Terraform provider metadata: cannot scrape Terraform registry: failed to scrape resource metadata from path: ../.work/confluentinc/confluent/docs/resources/confluent_ksql_cluster.md: failed to find the prelude of the document using the xpath expressions: //text()[contains(., "description") and contains(., "page_title")]
I'm also hitting this issue. Is there a way around it?
Hi @displague,
Could you please try overriding the default value of the --prelude-xpath command-line argument in apis/generate.go with something like:
//go:generate go run github.com/upbound/upjet/cmd/scraper -n ${TERRAFORM_PROVIDER_SOURCE} -r ../.work/${TERRAFORM_PROVIDER_SOURCE}/${TERRAFORM_DOCS_PATH} -o ../config/provider-metadata.yaml --prelude-xpath "//text()[contains(., \"subcategory\")]"
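To illustrate why the override above can help, here is a minimal sketch of the two XPath predicates reduced to plain substring checks. It is not the scraper's code and does not evaluate XPath against a rendered document; the function names and the sample frontmatter are hypothetical and only model the `contains(...)` conditions from the error message and from the suggested override.

```go
package main

import (
	"fmt"
	"strings"
)

// matchesDefaultPrelude models the default predicate quoted in the error message:
// contains(., "description") and contains(., "page_title").
func matchesDefaultPrelude(text string) bool {
	return strings.Contains(text, "description") && strings.Contains(text, "page_title")
}

// matchesOverridePrelude models the suggested override:
// contains(., "subcategory").
func matchesOverridePrelude(text string) bool {
	return strings.Contains(text, "subcategory")
}

func main() {
	// Hypothetical frontmatter that omits page_title (assumption, for illustration only).
	frontmatter := "subcategory: \"Example\"\ndescription: |-\n  Example resource description.\n"

	fmt.Println("default predicate matches:", matchesDefaultPrelude(frontmatter))
	fmt.Println("override predicate matches:", matchesOverridePrelude(frontmatter))
}
```

For this sample the default predicate does not match because `page_title` is absent, while the override matches on `subcategory`. A document that also lacks `subcategory` would presumably need a different expression.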
@ulucinar This seems to have unblocked the build. I do see new code documentation bugs after making this change. Some lines are duplicated and some lines are pulled from the wrong section of the Terraform docs.
Hi @displague,
We had a recent change in the scraper that was motivated by fixing a case in upbound/provider-gcp. I don't expect it to address the duplication or the wrong-section issue you mentioned above, but if you would like to give it a try, you may do so by updating your upjet dependency and adding the optional command-line argument --resource-prefix equinix to the generate comment like we do here. Sorry for the inconvenience. It's quite challenging for one scraper to be able to handle all the cases.
Getting this issue trying to generate from the Fastly provider. It is erroring on empty markdown files.
The only workaround I have at the moment is to delete these files.
What happened?
While investigating the scraper for https://github.com/crossplane-contrib/provider-jet-equinix/pull/18 and processing https://raw.githubusercontent.com/equinix/terraform-provider-equinix/master/docs/resources/equinix_ecx_l2_connection.md,
According to https://developer.hashicorp.com/terraform/registry/providers/docs#yaml-frontmatter, page_title is optional. Does the scraper know how to handle the Markdown documents and formatting described here? https://developer.hashicorp.com/terraform/registry/providers/docs#format
How can we reproduce it?
Check out https://github.com/crossplane-contrib/provider-jet-equinix/pull/18/commits/97eb823e70fd751ff0265128a6db4dfbad9d8909 and run:
make generate