zowe / zowe-explorer-vscode

Visual Studio Code Extension for Zowe, which lets users interact with z/OS Data Sets, Unix System Services, and Jobs on a remote mainframe instance. Powered by Zowe SDKs.
Eclipse Public License 2.0
172 stars 92 forks source link

DatasetFSProvider readFile() fails to fetch contents of members with same parent when called by extenders #3267

Open benjamin-t-santos opened 3 hours ago

benjamin-t-santos commented 3 hours ago

Describe the bug

According to the FileSystemProvider wiki page, you can call vscode.workspace.fs.readFile(pdsMemberUri) to check if a member exists. If it does, its contents are returned. If it does not, a FileNotFound error is thrown. With the current implementation of readFile(), however, checking multiple members from the same PDS will incorrectly throw FileNotFound errors for every member after the first member is checked. Explanation of why in "Additional Context". To Reproduce

  1. Create two data set URIs that correspond to members in the same PDS (e.g. zowe-ds:/profile/PDS.NAME/MEMBER1, zowe-ds:/profile/PDS.NAME/MEMBER2
  2. Call vscode.workspace.fs.readFile with both URIs
  3. The call with MEMBER1 will have its contents returned, the call with MEMBER2 throws error: EntryNotFound (FileSystemError): zowe-ds:/profile/PDS.NAME/MEMBER2

Expected behavior

Contents for both members are returned.

Screenshots

Desktop (please complete the following information):

Additional context

  1. When readFile() is called, ZE starts by trying to find an existing DsEntry for the provided member URI.
  2. If one is not found, we get the entry of the parent of the provided member URI.
  3. If that entry is null, we fetch the resource from remote and create entries for the parent and the member URI (all done in fetchDataset())

All that behavior is okay. Now we call readFile() with a second member of the same PDS.

  1. readFile() is called, ZE starts by trying to find an existing DsEntry for the second member.
  2. No entry is found, so we get the entry of the parent of the provided member URI.
  3. The parent's entry is found, so we do not fetch the member from remote or create an entry from it (fetchDataset() is never called)
  4. The ds variable never gets set to an entry, so an error is thrown on line 401

Proposed solution: Do not always assume an entry for a member exists if its parent's entry exists. This is a fair assumption if you are using the tree views (I am assuming opening a PDS creates entries for its members), but the same guarantee does not exist for extenders.

Two potential options:

  1. Do not check for the parent's existence. If the data set does exist fetchDataset() will return without making any calls to remote and just return the value of this.lookup(uri)
  2. Let extenders add a fetch=true query for readFile() like we can for vscode.workspace.fs.stat(). When checking the parent, we can check for the fetch query in the same conditional. Something like if (parent == null || queryParams.has("fetch"))

Happy to open a PR, but wanted to open an issue to discuss a) whether we want to support this and b) what approach is preferred

github-actions[bot] commented 3 hours ago

Thank you for creating a bug report. We will investigate the bug and evaluate its impact on the product. If you haven't already, please ensure you have provided steps to reproduce the bug and as much context as possible.

traeok commented 2 hours ago

Hi @benjamin-t-santos,

If you call readDirectory with the PDS URI first, and then read one of its members using readFile, does this solution work for you?

We try to avoid fetching extra resources as a part of readFile, but we can expand it to fetch other PDS members during the remote lookup if needed.

benjamin-t-santos commented 2 hours ago

Calling readDirectory with the PDS URI first and then reading one of its members using readFile works, but that is not the approach that I want to take. A PDS can have 1000s of members - creating 1000s of entries is slow when I ultimately use readFile to get the contents for just one or two members I need.

I am not requesting readFile to fetch extra resources during the remote lookup - just the member I requested. It is my understanding that if the data set or member exists on the host, readFile() should return its contents, including fetching the requested resource and creating an entry for it if needed. Currently, it does not always do that.

traeok commented 2 hours ago

That's fair - in that case I think we should go with option 1.

As an alternative, you could call stat with fetch=true to populate the member and then use readFile to fetch its contents - but I still think this case w/ PDS members should be handled for consistency.

benjamin-t-santos commented 56 minutes ago

I was calling stat() + readFile() as you suggested originally, but just readFile() is a bit faster as it avoids the "stat is locating resource" step. When fetching lots of members, it can add up.