Please check off the ones you have done, including any you have only started, so that no one else grabs the same one. Prioritize topics that do not link to other unported topics, so we have fewer clashes. Think ahead about the file's category, as this will affect its filename and id. Merging very short, terse, vague, or prosey topics into detailed how-tos or more useful documents is encouraged. Fewer, better documents are preferable to more, weaker ones.
Accounts and Billing
[x] Is Diffbot compliant with GDPR?
[ ] Can I create additional / child tokens under my account?
[ ] Can I receive my invoice / receipt or billing history as a PDF?
[ ] How can I update my credit card details?
[ ] Does Diffbot offer manual invoicing, custom terms or other payment options?
[ ] What counts as an API call?
Automatic APIs
[ ] Fixing a misidentified page type with Analyze API
[ ] Can I send HTML or text directly to Diffbot APIs?
[ ] Which spoken languages (humanLanguage) are identified in Diffbot APIs?
[ ] Do Diffbot APIs cache responses?
[ ] How do I set custom headers in API calls or while crawling?
[ ] Using Diffbot Proxy Servers / Proxy IPs
[ ] The Analyze API “fallback” argument
[ ] How long can a single request take / what is the Diffbot API timeout?
[ ] Can Diffbot APIs Extract Content from PDFs or Other Documents?
[ ] Accessing Product Prices in Other Currencies with the Product API
[ ] Moving to New Versions of Diffbot APIs
[ ] Semantria-powered sentiment, entity-extraction and other text analysis features
[ ] When to use the Analyze API versus individual Automatic APIs
[ ] Do Diffbot APIs execute Javascript?
[ ] Do Diffbot APIs follow redirects?
[ ] Does Diffbot handle non-English pages?
[ ] Can Diffbot access content within an intranet or requiring a login?
[ ] What counts as an API call?
[ ] Improving API response times
[ ] How to concatenate multiple-page articles using a custom rule
[ ] Date normalization in the Article API
[ ] How Diffbot handles multiple-page articles and discussions
Bulk API Service
[ ] Too Many Collections Error
[ ] Are bulk processing URLs returned in the same order as submitted?
[ ] What does “all crawling temporarily paused by root administrator…” mean?
[ ] How do I set custom headers in API calls or while crawling?
[ ] Using Diffbot Proxy Servers / Proxy IPs
[ ] How quickly does the Bulk Service process web pages?
[ ] When is crawl or bulk job data deleted?
[ ] Using Zapier with Crawlbot or Bulk API jobs
[ ] Using the Crawlbot or Bulk API querystring parameter
Crawlbot
[ ] Does Crawlbot support authenticated crawling?
[ ] Too Many Collections Error
[ ] Can I limit processing to articles written before, after or between certain dates?
[ ] Can I spider multiple sites in the same crawl? Is there a limit to the number of seed URLs?
[ ] Can multiple Diffbot extraction APIs be used in a single crawl?
[ ] Can Crawlbot use a site map (or sitemap) as a crawling seed?
[ ] Can Diffbot crawl sites that use “infinite” or “endless” scrolling?
[ ] Why is my crawl not crawling (and other uncommon crawl problems)?
[ ] What does “all crawling temporarily paused by root administrator…” mean?
[ ] How do I set custom headers in API calls or while crawling?
[ ] Using Diffbot Proxy Servers / Proxy IPs
[ ] Does Crawlbot follow “hashtag” links / internal links / fragment identifiers?
[ ] When is crawl or bulk job data deleted?
[ ] How do I stop a “never-ending” crawl due to dynamic URLs or querystrings?
[ ] How are repeating/recurring crawls scheduled?
[ ] How to find and access Ajax-generated links while crawling
[ ] How does Diffbot handle duplicate pages/content while crawling?
[ ] How can I check how many articles, products or other pages have been found?
[x] How can I limit the depth of my crawl?
[x] Which regular expression standard / syntax does Crawlbot use?
[ ] How can I crawl (news) sites and monitor/extract only recent content?
[ ] In a recurring crawl, how do I download only the latest round’s content?
[ ] How long does it take to crawl a site?
[x] Crawl and Processing Patterns and Regexes
[x] Will Crawlbot spider across domains or subdomains?
[ ] Using Zapier with Crawlbot or Bulk API jobs
[ ] Do Diffbot APIs execute Javascript?
[ ] Do Diffbot APIs follow redirects?
[ ] Does Crawlbot respect the robots.txt protocol?
[ ] Using the Crawlbot or Bulk API querystring parameter
[x] What’s the difference between crawling and processing?
Custom API Toolkit
[ ] Injecting JavaScript into Custom API and replaying AJAX calls
[ ] Fixing a misidentified page type with Analyze API
[ ] Using the Replace Filter in the Custom API Toolkit
[ ] Backing up and restoring custom APIs and rules
[x] Accessing Data Which Requires a Login
[x] Custom API Preview Failing: How to Build a Custom Field Manually
[ ] How to correct Article, Product, or other API output with a custom rule
[ ] How do custom APIs handle different templates?
[ ] Can I create multiple custom rules for a single site?
[ ] What happens when a custom rule “breaks?”
[ ] How do I correct the ‘images’ field in the Article API?
[ ] Concatenating multiple pages with a custom API
[ ] Transitioning Custom Rules to Updated API Versions
[ ] Regular Expressions in the API Toolkit
[ ] Help with custom collections
[ ] How can I access META elements using the Custom API Toolkit?
[ ] Do Diffbot APIs execute Javascript?
[ ] Do Diffbot APIs follow redirects?
[ ] Can Diffbot access content within an intranet or requiring a login?
[ ] How to concatenate multiple-page articles using a custom rule
[x] Why is a web page preview sometimes mis-formatted (or invisible)?
[ ] How can I use CSS selectors to select multiple items?
Accounts and Billing
Automatic APIs
Bulk API Service
Crawlbot
Custom API Toolkit
Errors
Global Index