airdcpp-web / airdcpp-webclient

Communal peer-to-peer file sharing application for file servers/NAS devices
https://airdcpp-web.github.io

No results returned for Websocket POST search action #406

Closed rishighan closed 3 years ago

rishighan commented 3 years ago

I'm using the following code to perform a search:

export const search = (data: SearchData) => {
  const connection = SocketService.connect("admin", "p");
  connection.then(async (d) => {
    const instance: SearchInstance = await SocketService.post("search");
    // await sleep(10000);
    await SocketService.post<SearchResponse>(
      `${SearchConstants.INSTANCES_URL}/${instance.id}/hub_search`,
      data,
    );
    // await sleep(10000);
    const results = await SocketService.get(
      `search/${instance.id}/results/0/5`,
    );
    console.log("results", results);
    SocketService.disconnect();
  });
};

where data is:

getDCPPSearchResults({
  query: {
    pattern: "H.P. Lovecraft",
    file_type: "compressed",
    extensions: ["cbz", "cbr"],
  },
  hub_urls: [
    "nmdcs://hub1",
    "dchub://dc",
    "dchub://dc2",
    "dchub://dc3",
  ],
  priority: 1,
})

This yields:


3 POST search/14/hub_search
Object { query: {…}, hub_urls: (4) […], priority: 1 }
  hub_urls: Array(4) [ "nmdcs://piter.feardc.net:411", "dchub://dc", "dchub://dc1", … ]
    0: "nmdcs://hub1"
    1: "dchub://dc"
    2: "dchub://dc2"
    3: "dchub://dc3"
    length: 4
    <prototype>: Array []
  priority: 1
  query: Object { pattern: "H.P. Lovecraft", file_type: "compressed", extensions: (2) […] }
  <prototype>: Object { … }
SocketLogger.ts:49:17

3 SUCCEEDED
Object { query: {…}, queue_time: 0, queued_count: 1, search_id: "2175015610" }
  query: Object { excluded: [], file_type: "compressed", pattern: "H.P. Lovecraft", … }
    excluded: Array []
    extensions: Array [ "cbz", "cbr" ]
    file_type: "compressed"
    max_size: null
    min_size: null
    pattern: "H.P. Lovecraft"
    <prototype>: Object { … }
  queue_time: 0
  queued_count: 1
  search_id: "2175015610"
  <prototype>: Object { … }

4 GET search/14/results/0/5 (no data)

4 SUCCEEDED
Array []

results
Array []

Any idea why no search results are returned?

maksis commented 3 years ago

Possibly because those file extensions aren't included in the compressed file type. See https://adc.sourceforge.io/ADC-EXT.html#_sega_grouping_of_file_extensions_in_sch for the recognized file extensions for each file type. I assume that you won't receive any results even if you perform the same search from the UI with that file type?

rishighan commented 3 years ago

Yes, that is correct. However, for this payload:

{
  "query": {
    "pattern": "ubuntu",
    "file_type": "any",
    "extensions": [
      "iso"
    ]
  },
  "hub_urls": [
    "nmdcs://piter.feardc.net:411",
    "dchub://dc.fly-server.ru",
    "dchub://dc.elitedc.ru",
    "dchub://dc.kcahdep.online"
  ],
  "priority": 1
}

I still get:

[Info] Login succeed (main.bundle.js, line 9232)
[Log] 26 – "POST" – "search" – "(no data)" (main.bundle.js, line 9232)
[Log] 26 – "SUCCEEDED" – {current_search_id: "", expires_in: 1799996, id: 33, …} (main.bundle.js, line 9232)
{current_search_id: "", expires_in: 1799996, id: 33, owner: "session:2112411548", query: null, …}Object
[Log] 27 – "POST" – "search/33/listeners/search_hub_searches_sent" – "(no data)" (main.bundle.js, line 9232)
[Log] 27 – "SUCCEEDED" – "(no data)" (main.bundle.js, line 9232)
[Log] 28 – "GET" – "search/33/results/0/5" – "(no data)" (main.bundle.js, line 9232)
[Log] 28 – "SUCCEEDED" – [] (0) (main.bundle.js, line 9232)
[Log] ASDASDASDASDASDASDA – [] (0) (main.bundle.js, line 3039)
[Info] Disconnecting socket (main.bundle.js, line 9232)

[Log] ASDASDASDASDASDASDA – [] (0) (main.bundle.js, line 3039) is the results returned from the search.

rishighan commented 3 years ago

Following up: you set me on the right path. I removed the file_type and extensions keys from the query object and got results.

Still, this is a valid payload, yea?

query: {
   "pattern": "ubuntu",
   "file_type": "any",
   "extensions": ["iso"],
},

Expanding on my thoughts a little bit: I want to be able to, through the UI (settings or otherwise) restrict the search to just cbr and cbz extensions. I don't want the user to be presented with any other results, that is what AirDCPP Web UI is for.

maksis commented 3 years ago

> [Log] ASDASDASDASDASDASDA – [] (0) (main.bundle.js, line 3039) is the results returned from the search.

Could you post the code you used to perform the search? I assume that it's different from the opening post, as you are now adding a listener for search_hub_searches_sent. Note that you can also launch the airdcppd daemon with the --cdm-client parameter, which will output all received search results in the console and might help with troubleshooting.

> Expanding on my thoughts a little bit: I want to be able to, through the UI (settings or otherwise) restrict the search to just cbr and cbz extensions. I don't want the user to be presented with any other results, that is what AirDCPP Web UI is for.

The best way to do that is to send extensions: ["cbz", "cbr"] without any file_type.
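
A minimal sketch of such a payload, assuming the same field names as the payloads earlier in this thread (`query.pattern`, `query.extensions`, `hub_urls`, `priority`). The `HubSearchPayload` type and the `buildExtensionSearch` helper are hypothetical names used only for illustration; they are not part of the API:

```typescript
// Hypothetical payload type mirroring the fields used earlier in this thread.
interface HubSearchPayload {
  query: {
    pattern: string;
    extensions: string[];
  };
  hub_urls: string[];
  priority: number;
}

// Build a hub_search payload that restricts by extension and omits file_type.
const buildExtensionSearch = (
  pattern: string,
  extensions: string[],
  hubUrls: string[],
  priority = 1,
): HubSearchPayload => ({
  query: {
    pattern,
    // Normalize to lowercase without a leading dot, e.g. ".CBR" -> "cbr".
    extensions: extensions.map((ext) => ext.replace(/^\./, "").toLowerCase()),
  },
  hub_urls: hubUrls,
  priority,
});

const payload = buildExtensionSearch("H.P. Lovecraft", ["cbz", ".CBR"], ["dchub://dc"]);
console.log(JSON.stringify(payload));
```

The resulting object could then be passed as the body of the `search/${instance.id}/hub_search` request shown in the opening post.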

However, since you are performing the searches in NMDC hubs, other users will return the first results matching the pattern (without taking the extension into account), and the file type filtering is only done locally for the received results. That is a protocol limitation and can't be avoided. If you performed the same search in an ADC hub, other hub users would only return results that are relevant to you by also matching the file type. That would ensure that you won't miss any wanted results because the users happened to match "junk" before moving on to the wanted files (the number of search results returned by each user is limited).

maksis commented 3 years ago

Sorry, my previous comment about the extensions isn't fully correct. Looks like it's currently filtering the incoming NMDC results only by the pattern, so the extension list will get ignored. I could possibly add such filtering in the next version but currently you need to filter the results by yourself. Everything will work correctly in ADC hubs though as there is nothing to filter locally.
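
Filtering the results yourself could look roughly like the sketch below. It assumes only that each result carries a file name in a `name` field; the `NamedResult` type and the `filterByExtension` helper are hypothetical, and the real result objects contain more fields than shown here:

```typescript
// Hypothetical minimal result shape: only a file name is assumed here.
interface NamedResult {
  name: string;
}

// Keep only the results whose file name ends with one of the wanted extensions.
const filterByExtension = <T extends NamedResult>(
  results: T[],
  extensions: string[],
): T[] => {
  const wanted = extensions.map((ext) => ext.toLowerCase());
  return results.filter((result) => {
    const dot = result.name.lastIndexOf(".");
    // Names without a dot (or ending with one) have no extension to match.
    if (dot < 0 || dot === result.name.length - 1) {
      return false;
    }
    const ext = result.name.slice(dot + 1).toLowerCase();
    return wanted.indexOf(ext) !== -1;
  });
};

const sample: NamedResult[] = [
  { name: "The Call of Cthulhu.cbz" },
  { name: "Dagon.CBR" },
  { name: "readme.txt" },
  { name: "no_extension" },
];
const filtered = filterByExtension(sample, ["cbz", "cbr"]);
console.log(filtered.map((r) => r.name));
```

Because the comparison is case-insensitive and applies to every result, the same filter can be run over the whole list regardless of which hub a result came from.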

I've also improved API documentation for the extensions and file_type fields.

maksis commented 3 years ago

The latest application version in the develop branch will filter NMDC results by extension in case you want to test it. I also noticed that at least some of the hubs that you use for testing seem to forward searches quite poorly so that may also have something to do with the empty result list.

rishighan commented 3 years ago

> Could you post the code you used to perform the search? I assume that it's different from the opening post, as you are now adding a listener for search_hub_searches_sent. Note that you can also launch the airdcppd daemon with the --cdm-client parameter, which will output all received search results in the console and might help with troubleshooting.

Sure, I was trying out a couple of things to get it to work. This is the most current iteration that works:

export const search = async (data: SearchData) => {
  await SocketService.connect("admin", "password");
  await sleep(10000);
  const instance: SearchInstance = await SocketService.post("search");
  await SocketService.post<SearchResponse>(
    `search/${instance.id}/hub_search`,
    data,
  );
  await sleep(10000);
  const results = await SocketService.get(`search/${instance.id}/results/0/5`);
  console.log("results", results);
  SocketService.disconnect();
  return results;
};

> However, since you are performing the searches in NMDC hubs, other users will return the first results matching the pattern (without taking the extension into account), and the file type filtering is only done locally for the received results. That is a protocol limitation and can't be avoided. If you performed the same search in an ADC hub, other hub users would only return results that are relevant to you by also matching the file type. That would ensure that you won't miss any wanted results because the users happened to match "junk" before moving on to the wanted files (the number of search results returned by each user is limited).

Gotcha. Didn't know that this was the case.

rishighan commented 3 years ago

> Sorry, my previous comment about the extensions isn't fully correct. Looks like it's currently filtering the incoming NMDC results only by the pattern, so the extension list will get ignored. I could possibly add such filtering in the next version but currently you need to filter the results by yourself. Everything will work correctly in ADC hubs though as there is nothing to filter locally.
>
> I've also improved API documentation for the extensions and file_type fields.

Filtering locally isn't a huge issue. I also assume that the results from ADC and NMDC hubs will be mixed with each other? There's no way for me to figure out when to trigger the filtering? I have to filter the results wholesale, correct?

rishighan commented 3 years ago

> The latest application version in the develop branch will filter NMDC results by extension in case you want to test it. I also noticed that at least some of the hubs that you use for testing seem to forward searches quite poorly so that may also have something to do with the empty result list.

Sure thing, I will test it out. Currently, I use AirDCPP on an Unraid box. But I can try the develop branch out locally.

maksis commented 3 years ago

> Sure thing, I will test it out. Currently, I use AirDCPP on an Unraid box. But I can try the develop branch out locally.

Also uploaded to http://web-builds.airdcpp.net/develop/

rishighan commented 3 years ago

Is there a way to run the dockerized version of the web client? I am running macOS Catalina

maksis commented 3 years ago

What do you mean? No Docker files are provided by this project.

rishighan commented 3 years ago

I am working off of https://github.com/gangefors/docker-airdcpp-webclient, and while it does have instructions to download the latest stable release, I don't know how to configure it to run the develop branch.

maksis commented 3 years ago

I recommend using the issue tracker of that project. What I see from the Dockerfile is that it always downloads the latest stable release: https://github.com/gangefors/docker-airdcpp-webclient/blob/743281e2206fc553ed94c0c0d30add98e6558402/Dockerfile#L4

maksis commented 3 years ago

Is the original issue with getting the search results resolved?

rishighan commented 3 years ago

I was getting to that; so there are 2 aspects:

  1. Results returned from ADC hubs: I tried the search with an ADC hub and I was able to retrieve results normally. Earlier, with the NMDC hubs, I had flaky behavior in that, for any search query requesting more than 5 results, it would return []. That was fixed once I queried an ADC hub.
  2. As for testing against the latest change you pushed, I will try the Docker approach. Not too familiar with it, but I'll just try changing the download URL in the Dockerfile and report back.

The original issue, which was not getting results, I think was addressed, so we can close this out.

maksis commented 3 years ago

Looks like there are instructions for building a different version at the end of the README: https://github.com/gangefors/docker-airdcpp-webclient#building-the-docker-image

Depending on how you want to implement your UI, it's also possible to be notified about every new search result that is being received (docs). That will allow you to display the results without any wait period (which is how the Web UI works).

rishighan commented 3 years ago

Gotcha, is this how you would do it for sockets based approach: https://airdcpp.docs.apiary.io/#reference/searching/event-listeners/grouped-result-added ? Also is the await sleep(x) just for illustration purposes?

maksis commented 3 years ago

> Gotcha, is this how you would do it for sockets based approach: https://airdcpp.docs.apiary.io/#reference/searching/event-listeners/grouped-result-added ?

It depends on the use case. If you can wait and you only need the most relevant matches, fetch the list as you are doing now. If you want to show all the received results to the user as soon as you possibly can, use the result listener instead.
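
A rough sketch of the listener-based approach. It deliberately uses a hypothetical `subscribe` function in place of a real socket call, and the `SearchResult` shape is an assumption; the actual listener name and payload should be taken from the Search API docs linked above:

```typescript
// Hypothetical result shape; the real API returns richer objects.
interface SearchResult {
  id: string;
  name: string;
}

type ResultHandler = (result: SearchResult) => void;
// Hypothetical subscription function: registers a handler, returns an unsubscribe callback.
type Subscribe = (handler: ResultHandler) => () => void;

// Collect results as they arrive and notify the caller after each one.
const collectResults = (
  subscribe: Subscribe,
  onChange: (results: SearchResult[]) => void,
) => {
  const results: SearchResult[] = [];
  const unsubscribe = subscribe((result) => {
    results.push(result);
    onChange(results.slice()); // hand out a copy so callers can't mutate our state
  });
  return { results, unsubscribe };
};

// In-memory stand-in for the socket, for demonstration only.
const handlers: ResultHandler[] = [];
const fakeSubscribe: Subscribe = (handler) => {
  handlers.push(handler);
  return () => {
    const index = handlers.indexOf(handler);
    if (index !== -1) handlers.splice(index, 1);
  };
};
const emit = (result: SearchResult) => handlers.slice().forEach((h) => h(result));

const seen: number[] = [];
const collector = collectResults(fakeSubscribe, (all) => seen.push(all.length));
emit({ id: "1", name: "a.cbz" });
emit({ id: "2", name: "b.cbr" });
collector.unsubscribe();
emit({ id: "3", name: "c.cbz" }); // ignored: the listener was removed
console.log(seen, collector.results.length);
```

In a real client, `subscribe` would wrap whatever listener registration the socket library provides, and `onChange` would re-render the result list.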

> Also is the await sleep(x) just for illustration purposes?

No it's not. It's required if you want to fetch the "complete" list of search results (instead of receiving the results as soon as they arrive). I've updated the introduction section in the Search API docs that will hopefully make it more clear:

Search matching in Direct Connect is a distributed operation: search queries are sent to other hub users, who will match them against their own shared items. If you are in active connectivity mode, they will send the possible search results directly to you via UDP (passive results are sent via hubs instead).

Processing the search query may be done by thousands of different clients all over the world with varying types of hardware and internet connection speeds, so it's far from an enterprise-level environment in terms of speed and reliability. This also means that you can't simply post a search and receive the complete list of results instantly as part of the response. Instead, you'll need to create a search instance for the search that will take care of collecting the results of that particular search and keep them available to be fetched later.

There is no clear answer to the question of how long it is necessary to wait for all the search results to arrive after all connected hubs have forwarded the search to their users, but generally it's good to wait as long as you reasonably can. The waiting period can usually be longer for background searches, while searches that were initiated by a user who's waiting to see the results usually provide a better user experience with shorter waiting periods (you may also use the listeners for newly received search results in such cases to eliminate the waiting period).
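
The wait-then-fetch pattern described above can be sketched as follows. The `sleep` helper used in the earlier snippets is not shown in this thread, so the version here is an assumption, and `fetchAfterWait` with its injected `fetchResults` callback is a hypothetical helper standing in for the real `GET search/{id}/results/...` request:

```typescript
// Resolve after the given number of milliseconds.
const sleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Hypothetical fetcher: returns the results collected so far.
type FetchResults<T> = () => Promise<T[]>;

// Wait for a fixed period, then fetch whatever has been collected.
const fetchAfterWait = async <T>(
  fetchResults: FetchResults<T>,
  waitMs: number,
): Promise<T[]> => {
  await sleep(waitMs);
  return fetchResults();
};

// Demonstration with a stand-in fetcher instead of a real socket call.
const demoResults = ["ubuntu-20.04.iso"];
const demo = fetchAfterWait(() => Promise.resolve(demoResults), 50);
demo.then((results) => console.log("results", results));
```

Passing the waiting period as a parameter makes it easy to use a longer wait for background searches and a shorter one for searches the user is actively watching.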

rishighan commented 3 years ago

Perfect. I am waiting 10 seconds before presenting the results, which is admittedly a shade too long for it to be deemed good UX.

I think I'll go with the subscription-based approach search_hub_searches_sent