Closed csmcallister closed 5 years ago
Also, got this error from the logs:
[ERROR] Exception occurred getting file size with redirected HEAD request from
https://www.fedconnect.net/FedConnect/?doc=28321319RI0000037&agency=SSA:
Invalid URL '/FedConnect/default.aspx?ReturnUrl=%2fFedConnect%2f%3fdoc%3d28321319RI0000037%26agency%3dSSA&doc=28321319RI0000037&agency=SSA':
No schema supplied. Perhaps you meant
http:///FedConnect/default.aspx?ReturnUrl=%2fFedConnect%2f%3fdoc%3d28321319RI0000037%26agency%3dSSA&doc=28321319RI0000037&agency=SSA?
@sbchrist Note that aca9a11 fixed the invalid HEAD request error mentioned above.
closed by af1d69c 🎉
FedConnect is a non-gov complement to fbo.gov and grants.gov. Some agencies apparently link to it when posting their solicitation docs on fbo.gov.
Expected Behavior
Script detects a FedConnect url, handles the redirect, and then scrapes the attachment urls.
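For the detection step, a minimal sketch of a host check might look like the following. The helper name `is_fedconnect_url` is hypothetical and is not part of the existing script; it only assumes that FedConnect links are served from `fedconnect.net` or one of its subdomains, as in the URL from the logs.

```python
from urllib.parse import urlparse


def is_fedconnect_url(url):
    """Return True if the URL's host is fedconnect.net or a subdomain of it.

    Hypothetical helper: matches on the hostname only, so look-alike
    hosts such as notfedconnect.net are not treated as FedConnect.
    """
    host = (urlparse(url).hostname or "").lower()
    return host == "fedconnect.net" or host.endswith(".fedconnect.net")
```

Matching on the parsed hostname rather than a substring of the whole URL avoids false positives when "fedconnect" only appears in a query string.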
Current Behavior
Script currently detects the redirect in get_fbo_attachments.FboAttachments.size_check() but then fails to handle it. The problem there is that there shouldn't be a return statement inside the condition's scope.
But even if it handled the redirect, it would land on a FedConnect page that requires some scrape logic to get the attachment(s).
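The "No schema supplied" error in the logs suggests the redirect's Location header is a relative path, which can't be requested as-is. Below is a rough sketch of resolving it against the original URL before following it. `resolve_redirect` and `REDIRECT_CODES` are hypothetical names, and this is not necessarily how size_check() does (or should do) it.

```python
from urllib.parse import urljoin

# HTTP status codes that carry a Location header to follow.
REDIRECT_CODES = {301, 302, 303, 307, 308}


def resolve_redirect(request_url, status_code, headers):
    """Return the absolute redirect target, or None if there is no redirect.

    `headers` is any mapping of response headers. A relative Location
    (e.g. '/FedConnect/default.aspx?...') is joined onto `request_url`
    so that the result always has a scheme and host.
    """
    if status_code not in REDIRECT_CODES:
        return None
    location = headers.get("Location")
    if not location:
        return None
    return urljoin(request_url, location)
```

With requests, the inputs could come from a HEAD request made with `allow_redirects=False`, passing `response.status_code` and `response.headers`. The caller would then continue with the returned URL instead of returning early.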
Possible Solution
neco.navy.mil
attachments, fbo.gov will redundantly link to the FedConnect site if there are multiple solicitation documents. Also, be aware that the FedConnect document hrefs use __doPostBack() to make a POST request to get the doc. Getting the docs will likely require using requests to make that POST request and the subsequent GET request given the POST response.
Steps to Reproduce (for bugs)
From the logs:
Context
It's unclear what proportion of FBO solicitations use FedConnect to host their docs, so the effects of not making this fix might be negligible.
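For the __doPostBack() hrefs mentioned under Possible Solution, here is a rough sketch of how the POST payloads could be built from a page's HTML. It assumes FedConnect follows the usual ASP.NET WebForms pattern (hidden inputs such as __VIEWSTATE echoed back along with __EVENTTARGET and __EVENTARGUMENT). The actual FedConnect markup hasn't been checked, and `PostBackPageParser` and `build_postback_payloads` are hypothetical names.

```python
import re
from html.parser import HTMLParser

# Matches the two quoted arguments in: javascript:__doPostBack('target','arg')
POSTBACK_RE = re.compile(r"__doPostBack\('([^']*)'\s*,\s*'([^']*)'\)")


class PostBackPageParser(HTMLParser):
    """Collects hidden form fields and __doPostBack link targets."""

    def __init__(self):
        super().__init__()
        self.hidden_fields = {}
        self.postback_targets = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "input" and (attrs.get("type") or "").lower() == "hidden":
            name = attrs.get("name")
            if name:
                self.hidden_fields[name] = attrs.get("value") or ""
        elif tag == "a":
            match = POSTBACK_RE.search(attrs.get("href") or "")
            if match:
                self.postback_targets.append(match.groups())


def build_postback_payloads(html):
    """Return one form payload (dict) per __doPostBack link in the page."""
    parser = PostBackPageParser()
    parser.feed(html)
    payloads = []
    for target, argument in parser.postback_targets:
        payload = dict(parser.hidden_fields)  # echo hidden state back
        payload["__EVENTTARGET"] = target
        payload["__EVENTARGUMENT"] = argument
        payloads.append(payload)
    return payloads
```

Each payload could then be sent with something like `session.post(page_url, data=payload)` on a `requests.Session`, so cookies carry over to the follow-up GET. Whether FedConnect needs additional fields or headers is untested.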