GLYCAM-Web / website

A bare-bones repo to contain public website stuff and issues related to the GLYCAM Web Django apps.

project.zip does not preserve symbolic links - might want to add project.tar #32

Open Lachele opened 3 years ago

Lachele commented 3 years ago

Tool: http://172.26.0.2/cb (local DevEnv)

Bug Description: When the project.zip archive is uncompressed, there are no symbolic links. This means two things:
1. It isn't possible to tell which of the conformers is the default.
2. One of the conformer directories is duplicated entirely, which wastes space.

When I look at the directory structure on the back-end, it occurs to me that we probably need to retool how we do our symbolic links. See Additional context.

To Reproduce:
1. Build any multi-conformer glycan using any of the oligosaccharide modeling front ends.
2. Choose to build at least one of the available conformers.
3. On the downloads page, click "Download All Structures".
4. Unzip the file.
5. Try to figure out which conformer is the default.

Expected behavior: I expect the top-level 'default' object to be a symbolic link to one of the other directories.

See also Additional context, below.

Additional context: Adding symbolic links might cause trouble for some users if their filesystem doesn't support symbolic links. Most *NIX systems should be fine. It will be worthwhile to test on Windows and Mac.

We will need to employ some fancy compression for returning files to users. In the set below, 'default' points to something outside its directory level. The others do, too. The user will need the non-default directories to contain actual data, but the 'default' directory should symlink to one of the other directories at that level. Many apologies for the annoyance you will probably feel trying to correct this.

├── Requested_Builds
│   ├── 5ogg_6ogg -> ../Existing_Builds/5ogg_6ogg
│   ├── 5ogg_6ogt -> ../Existing_Builds/5ogg_6ogt
│   ├── 5ogt_6ogg -> ../Existing_Builds/5ogt_6ogg
│   ├── 5ogt_6ogt -> ../Existing_Builds/5ogt_6ogt
│   └── default -> ../Existing_Builds/default
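
The layout asked for above (real data in each conformer directory, with 'default' as a link to a sibling) could be staged before archiving roughly as follows. This is only a sketch under stated assumptions: `stage_for_download`, the `staging` directory, and the `default_id` argument are hypothetical names, and it assumes the caller already knows which conformer ID the default build corresponds to.

```python
import os
import shutil
from pathlib import Path


def stage_for_download(requested: Path, staging: Path, default_id: str) -> None:
    """Build a self-contained copy of `requested` inside `staging` (hypothetical helper)."""
    staging.mkdir(parents=True, exist_ok=True)
    copied = []
    for entry in sorted(requested.iterdir()):
        if entry.name == "default":
            continue
        # resolve() follows the link out to Existing_Builds; copytree's default
        # (symlinks=False) also follows any links nested inside that directory.
        shutil.copytree(entry.resolve(), staging / entry.name)
        copied.append(entry.name)

    default_link = staging / "default"
    if default_id in copied:
        # Relative link to a sibling, so it stays valid wherever the user unpacks it.
        os.symlink(default_id, default_link, target_is_directory=True)
    else:
        # Assumption: if the default conformer was not requested, ship a real copy.
        shutil.copytree((requested / "default").resolve(), default_link)
```

Called as `stage_for_download(Path("Requested_Builds"), Path("staging"), "5ogg_6ogg")`, this would leave `staging/default` pointing at `5ogg_6ogg` rather than at anything outside the staged directory.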

gitoliver commented 3 years ago

"1. It isn't possible to tell which of the conformers is the default."
If they care, they can figure it out. Or, if we get enough requests about it, we can rename the PDB file or add a table of info.

"2. One of the conformer directories is duplicated entirely, which wastes space."
It's needed to show a default structure, which may be where the user stops, before any conformers are generated. One could replace it or link to it, but the current setup would require a rework to do this.

"We will need to employ some fancy compression for returning files to users."
This sounds separate from everything else. Dan handles compression and serving.

"In the set below, 'default' points to something outside its directory level. The others do, too."
Hmm, I didn't realize you were getting a zip full of links. We will have to "resolve" the links first and put all the files into one folder that gets zipped. The actual files can be in different projects, i.e. the Existing_Builds folder also contains links.

Lachele commented 3 years ago

I think I didn't make myself clear. I'm not getting links in the zip file. I want at least one link in the zip file. I want the link to show what the default structure is. That is, I want it to look like this:

$ ls -l
total 2588
drwxrwxr-x 3 lachele lachele    4096 Mar 5 03:30 5ogg_6ogg
drwxrwxr-x 2 lachele lachele    4096 Mar 5 03:04 5ogg_6ogt
drwxrwxr-x 2 lachele lachele    4096 Mar 5 03:04 5ogt_6ogg
drwxrwxr-x 2 lachele lachele    4096 Mar 5 03:04 5ogt_6ogt
lrwxrwxrwx 1 lachele lachele       9 Mar 5 03:05 default -> 5ogg_6ogg
-rw-rw-r-- 1 lachele lachele 2632335 Mar 5 03:04 project.zip
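
For the project.tar idea in the issue title, Python's standard `tarfile` module stores symbolic links as links by default (`dereference=False`), so a listing like the one above should survive a round trip. A minimal sketch, assuming a directory already laid out this way; `make_project_tar` and its arguments are hypothetical names, not part of the existing code:

```python
import tarfile
from pathlib import Path


def make_project_tar(staging: Path, tar_path: Path) -> None:
    """Pack the contents of `staging` into an uncompressed tar (hypothetical helper)."""
    with tarfile.open(tar_path, "w") as tar:
        for entry in sorted(staging.iterdir()):
            # Do not pack the archive into itself if it lives in the same directory.
            if entry.resolve() == tar_path.resolve():
                continue
            # With tarfile's default dereference=False, 'default' is stored as a
            # symlink entry, not as a second copy of the directory it points to.
            tar.add(str(entry), arcname=entry.name)
```

Whether the link is recreated on extraction still depends on the user's platform and extraction tool, which is the Windows and Mac testing concern raised earlier.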

gitoliver commented 3 years ago

default.pdb gets built when the first request for information about a sequence is made. I can name the PDB file after whatever the rotamer is at that point, which is probably the least painful option, but Dan may need to adjust to use whatever name is being pointed to, rather than hardcoding "default.pdb". In the second request, the user may or may not request the rotamer we use as the default. For instance, in the example above our default is 5ogg_6ogg, but the user may not request that rotamer. So if you want a symlink in cases where the user requests the default, and a separate PDB file in cases where they do not, that requires code to figure it out. That is, you can't just build everything in the list of user selections; you have to figure out which one (if any) is the default and skip it.
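
The "figure out which one (if any) is the default and skip it" step might look something like the sketch below. It is illustrative only: `plan_builds` is a hypothetical function, and it assumes the builder already knows the conformer ID of the default structure.

```python
from typing import List, Tuple


def plan_builds(requested_ids: List[str], default_id: str) -> Tuple[List[str], bool]:
    """Split a user's selections into conformers to build and a 'link default?' flag."""
    # The default structure already exists from the first request, so skip rebuilding it.
    to_build = [cid for cid in requested_ids if cid != default_id]
    # Only symlink 'default' to a conformer directory if the user asked for that conformer.
    default_requested = default_id in requested_ids
    return to_build, default_requested
```

With the example above, `plan_builds(["5ogg_6ogg", "5ogt_6ogg"], "5ogg_6ogg")` returns `(["5ogt_6ogg"], True)`; if the user had not selected 5ogg_6ogg, the flag would be `False` and a separate default PDB file would be provided.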

Lachele commented 3 years ago

Can there be an entry in some appropriate log file that says something like CONFORMER_LABEL=5ogg_6ogg? Then it will be easy for anyone to figure out, at any time.

If the user doesn't request the default structure, then the builder needs some way to know to not return the "default" folder when the user clicks "Download All". Having the label info live inside the directories seems reasonable.

Lachele commented 3 years ago

If that info could come back in the JSON response, too, I think Dan would be pleased. Also any other users. But, I think it needs to be in the directory, too.

danwentworthart commented 3 years ago

The json response does currently give me a conformer label field, but it is a duplicated value of the conformerID. The value just needs to be the label you want.

Lachele commented 3 years ago

I think I meant conformerID. Sorry. :-) I mean the string that is used to name the directory for that conformer's build.

I just checked the contents of amber_submission.json, and that sort-of contains the info, but you have to dig it out of a directory path. It would be kind to the user to drop that info into a file within the directory. Having the info inside the directory is good because directory names might get changed.
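
Dropping the ID into a file inside each build directory could be as small as the sketch below. The file name `conformer_info.txt` and both helper functions are hypothetical; only the `CONFORMER_LABEL=...` line format comes from the suggestion earlier in this thread.

```python
from pathlib import Path
from typing import Optional

LABEL_FILE = "conformer_info.txt"  # hypothetical file name
LABEL_KEY = "CONFORMER_LABEL"


def write_conformer_label(build_dir: Path, conformer_id: str) -> Path:
    """Record the conformer ID inside its own build directory."""
    path = build_dir / LABEL_FILE
    path.write_text(f"{LABEL_KEY}={conformer_id}\n")
    return path


def read_conformer_label(build_dir: Path) -> Optional[str]:
    """Return the recorded conformer ID, or None if no label file or key is present."""
    path = build_dir / LABEL_FILE
    if not path.is_file():
        return None
    for line in path.read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == LABEL_KEY:
            return value.strip()
    return None
```

Because the label lives in the directory's contents, it stays readable even if the directory is renamed or 'default' is shipped as a plain copy.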

This might be a v2 thing, but it would also be useful for the 'default' directory to be a symlink to the conformer ID of the conformer that it is a duplicate of, instead of being a duplicate.

gitoliver commented 3 years ago

Having the user select whether or not they want default is another feature we can add to the list. Once we have a list that includes user feedback from v1, we can prioritize.

Looking back over everything here I don't see anything I think is necessary for v1.

Lachele commented 3 years ago

I agree that this is V_2.

cexum commented 1 year ago

We suspect this issue will be resolved by the work Lachele is doing in GEMS. When that is done, get the details to Dan.

gitoliver commented 4 months ago

This doesn't seem to be happening yet: DManpa1-6[DManpa1-2DManpa1-3]DManpa1-6[DManpa1-3]DManpb1-4DGlcpNAcb1-4DGlcpNAcb1-OH.zip