Closed MicheleRoar closed 6 years ago
I totally agree that the resulting dataset should have the same structure in all cases. However, I would like to discuss a bit more which structure is best.
I would propose to use the
/path/to/result/
S_00000.gdm
S_00000.gdm.meta
...
version because it seems the clearest one, but I would also like to have the opinion of @marcomass and @Sim1Pall8a, because this modification may also affect the R API.
Please tell us your opinion.
This issue was also addressed in GMQL issue #87 (https://github.com/DEIB-GECO/GMQL/issues/87). My opinion is the following: to standardize the three modalities described by Michele, we need to consider the requirements of each of them.
1) Let's start from "downloading the results of a query done using the web interface". In this case the structure was defined by Arif (whom I include here, @acanakoglu), since it is required to download all included files and to separate the sample and schema files from the others.
2) Storing from the API generates a very similar structure, the only difference being that the subdirectory is named exp/ instead of files.
3) pyGMQL does not include the subdirectory (since it only provides sample files).
If we adopt structure 3) for all cases, in cases 1) and 2) we would mix sample files with the other files, which I think is much better to avoid (as Arif decided). So I would adopt 1) or 2), and since we already have several datasets with structure 1), I would adopt it, just changing the subdirectory name in the API (see GMQL issue #87: https://github.com/DEIB-GECO/GMQL/issues/87).
If we agree on this, who can make the API change? And, at the same time, harmonize the schema file name (as also indicated in GMQL issue #87, https://github.com/DEIB-GECO/GMQL/issues/87), i.e. close GMQL issue #87?
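Until the layouts are harmonized, client code has to cope with all three of them. A minimal sketch of how that could look, assuming the subdirectory names mentioned in this thread (files for the web download, exp for the API) and the S_00000.gdm / S_00000.gdm.meta naming; the function names find_sample_dir and list_samples are hypothetical and not part of GMQL or pyGMQL:

```python
from pathlib import Path

# Assumption: subdirectory names taken from this thread
# ("files" for the web download, "exp" for storing from the API).
# A flat directory corresponds to the pyGMQL case.
KNOWN_SUBDIRS = ("files", "exp")


def find_sample_dir(result_dir):
    """Return the directory that holds the .gdm sample files."""
    root = Path(result_dir)
    for name in KNOWN_SUBDIRS:
        candidate = root / name
        if candidate.is_dir():
            return candidate
    # No known subdirectory: assume the flat layout.
    return root


def list_samples(result_dir):
    """Map each sample file name to its metadata file name (None if missing)."""
    sample_dir = find_sample_dir(result_dir)
    pairs = {}
    for sample in sorted(sample_dir.glob("*.gdm")):
        meta = sample.with_name(sample.name + ".meta")
        pairs[sample.name] = meta.name if meta.exists() else None
    return pairs
```

Calling list_samples("/path/to/result/") would then give the same kind of mapping whichever of the three layouts is on disk, which is roughly what a single agreed structure would make unnecessary.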
I discussed this with Luca, and we decided as below. I took the zip structure as the base structure, and we will correct the others with respect to that one.
If it is not clear, please let me know.
Case 1 can be done by @andreagulino or by me. Case 3 should be done by @lucananni93 and @Sim1Pall8a.
@acanakoglu What is the status of this issue?
------------ Comment added by Luca for clarity (PLEASE ADD BETTER DESCRIPTION NEXT TIME) ------------
When a query is performed locally with a statement like the following
The results are stored in the
/path/to/result/
path with the following structure:
while, when downloading the results of a query done using the web interface, the result structure is:
and finally, when downloading the results of a remote query using the library, the structure is the following:
There must be coherence among all the ways of downloading or generating datasets.