ucscGenomeBrowser / kent

UCSC Genome Browser source tree. Stable branch: "beta".
http://genome.ucsc.edu/

Bug : URL redirection does not seem to work properly in UDC #53

Closed sanchit-saini closed 3 years ago

sanchit-saini commented 3 years ago

Hello,

rtracklayer relies on the kent library, and it seems UDC (URL data cache) does not handle URL redirection properly, which is causing https://github.com/lawremi/rtracklayer/issues/42.

I tested the udcFileMayOpen function with "http://bedbase.org/api/bed/78c0e4753d04b238fc07e4ebe5a02984/file/bigbedfile". For ease of testing, I inserted the following code segment in the src/utils/bedToBigBed utility (in the usage function).

void usage()
/* Explain usage and exit. */
{
struct udcFile *udcTestFile =
udcFileMayOpen("http://bedbase.org/api/bed/78c0e4753d04b238fc07e4ebe5a02984/file/bigbedfile", udcDefaultDir());
if (udcTestFile == NULL)
    printf("Not Working\n");
else
    printf("Working\n");
...

After compiling, I received "Not Working" as output. The function udcInfoViaHttp seems related to the URL redirection. I would be happy to help.

Thanks!

genome-www commented 3 years ago

Hi there,

The issue here is the server is not responding properly to HEAD requests, as evidenced below:

curl -I http://bedbase.org/api/bed/78c0e4753d04b238fc07e4ebe5a02984/file/bigbedfile

HTTP/1.1 404 Not Found
Server: nginx/1.19.6
Date: Tue, 09 Mar 2021 15:48:17 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 9
Connection: keep-alive

If the server responds with the proper redirect HTTP status code then this should work fine.
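
For readers who want to check a response like the one above programmatically, here is a minimal sketch of reading the status line and deciding whether it is a redirect. The function names httpStatusCode and httpStatusIsRedirect are hypothetical and are not part of the kent library; this only illustrates the check and is not how UDC implements it.

```c
#include <stdbool.h>
#include <stdio.h>

int httpStatusCode(const char *statusLine)
/* Return the numeric code from a status line such as
 * "HTTP/1.1 404 Not Found" or "HTTP/2 200", or -1 if malformed.
 * Hypothetical helper for illustration, not a kent function. */
{
int major = 0, minor = 0, code = 0;
if (statusLine == NULL)
    return -1;
/* Try the "HTTP/1.1 404" form first, then the "HTTP/2 200" form. */
if (sscanf(statusLine, "HTTP/%d.%d %d", &major, &minor, &code) != 3 &&
    sscanf(statusLine, "HTTP/%d %d", &major, &code) != 2)
    return -1;
if (code < 100 || code > 599)
    return -1;
return code;
}

bool httpStatusIsRedirect(int code)
/* True for the status codes that tell a client to retry at the
 * address given in the Location header. */
{
return code == 301 || code == 302 || code == 303 || code == 307 || code == 308;
}
```

For the HEAD response shown above, httpStatusCode returns 404 and httpStatusIsRedirect is false, so a client that depends on the HEAD result has no redirect to follow.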

Please let us know if you have any further questions.

Christopher Lee UCSC Genomics Institute

sanchit-saini commented 3 years ago

Okay, understood. Thanks, I never thought about questioning the server. I tested it with another URL. It worked as expected.

nsheff commented 3 years ago

Just to add for future reference, this is because bedbase is built using FastAPI, which does not provide HEAD responses automatically for GET endpoints. The issue is documented here: https://github.com/tiangolo/fastapi/issues/1773

In practice, a client doesn't always need the HEAD response to follow the URL redirect; browsers and wget, for example, will still work. It just depends on whether the requester relies on the HEAD response, and the UCSC code does require it. Anyway, we will solve it on our end for now, and per that issue, FastAPI should eventually support HEAD automatically.
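
To make the difference concrete, here is a rough sketch of the decision a more tolerant client could make after a HEAD probe. The enum and the function nextProbeAction are hypothetical names, and treating 404, 405 and 501 as "HEAD may be unsupported" is an assumption for illustration. It does not describe what UDC does; per this thread, UDC relies on the HEAD response.

```c
#include <stdbool.h>

/* What a client might do after probing a URL with HEAD (hypothetical). */
enum probeAction
    {
    probeUseHead,       /* HEAD answer is usable (success or redirect) */
    probeRetryWithGet,  /* HEAD looks unsupported, so try GET once */
    probeFail           /* give up */
    };

enum probeAction nextProbeAction(int headStatus, bool alreadyTriedGet)
/* Decide the next step from the HEAD status code. Assumes that 404,
 * 405 and 501 can mean the server has no HEAD handler for the URL. */
{
if (headStatus >= 200 && headStatus < 400)
    return probeUseHead;
if (!alreadyTriedGet &&
    (headStatus == 404 || headStatus == 405 || headStatus == 501))
    return probeRetryWithGet;
return probeFail;
}
```

With the 404 seen for the bedbase URL, this policy would retry with GET once, which is roughly why browsers and wget still succeed, while a client that stops at the HEAD result reports a failure.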

genome-www commented 3 years ago

Yes, FastAPI was mostly built for creating fast web APIs. HEAD requests are not important in this context.
