Open stuchalk opened 5 months ago
@Adafede, if you could give me some guidance on how to optimize the packages for this, I could take a stab...
Hi @stuchalk, as already said in #32, I do not really understand what you are trying to achieve with this, especially as this repository is unmaintained.
If I can help you achieve your goal without having to change unmaintained parts, happy to!
My goal is to install the lotus dataset locally, as I am trying to generate semantic versions of the lotus data (combined with chembl data). The set of compounds that are in both lotus and chembl will generate over 400,000 JSON-LD files that represent the data semantically. Generating this set through the lotus website API takes a long time, hence I want to install the data locally. My question above was about updating node/npm to be able to create the target folder. Currently I am getting an error because the version of node being used does not have an arm64 build for macOS. I was hoping to get help updating the configuration so that I can run this on my M chip mac.
I would still insist this is not the best way to obtain the lotus data locally. Having an up-to-date mirror of what is on Wikidata (or Zenodo) would be much better. You can directly retrieve a semantic version from there. Maybe we should discuss it more in depth so we can best help you?
I have looked on Wikidata.org, but I am not sure where the lotus dataset is. Can you point me to it, or tell me how to access it? If it is there in full, then it will likely work well for what I am doing.
What is the minimum amount of information you would like to have? SMILES? taxon names? DOIs? Additional metadata?
We were using over 90% of the metadata fields in our code, so we would need to find where all the data is and filter out what we don't need from there...
I suspect we misunderstand each other a lot about the things we are talking about. We probably need to talk about it in more detail. Or could you point me to the said code so I can better understand your needs?
Aren’t https://zenodo.org/records/7534077 and https://zenodo.org/records/7534071 covering most (all?) of your needs?
Let me check the Zenodo datasets and thank you for sharing them. I was not aware they were available...
OK, you can close this issue. I went back to the mongodb download and will use the main json file to get the data. Thanks for the help, and apologies for so many questions...
The data is heavily outdated but fine 😢
So, there is a newer version of this data? Which repo? If this is the case it might be a good idea to put a notification on this repo to point users to the newest version... I was assuming this was the only version...
No, there is no newer version of « this » data, as it became unmaintainable over time. Our data now lives on the other sources I mentioned above, which is why I did not recommend using the mongoDB. We would still like a better solution and are working on it.
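For anyone landing here, a minimal sketch of what pulling compound/organism pairs from Wikidata could look like. It assumes the public Wikidata Query Service endpoint and the properties P235 (InChIKey), P233 (canonical SMILES), P703 (found in taxon) and P225 (taxon name); the function names, the User-Agent string and the small `LIMIT` are illustrative choices only, and this is not necessarily how the maintainers query the data.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service SPARQL endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_query(limit=10):
    """Build a SPARQL query for compounds with an InChIKey that are
    stated to be found in a taxon. `limit` keeps the result small,
    since an unbounded query is likely to time out."""
    return f"""
SELECT ?compound ?inchikey ?smiles ?taxon ?taxon_name WHERE {{
  ?compound wdt:P235 ?inchikey ;
            wdt:P703 ?taxon .
  OPTIONAL {{ ?compound wdt:P233 ?smiles . }}
  ?taxon wdt:P225 ?taxon_name .
}}
LIMIT {int(limit)}
""".strip()


def run_query(query, endpoint=WDQS_ENDPOINT):
    """Send the query over HTTP and return the list of result bindings."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(
        url,
        headers={
            # Hypothetical User-Agent; replace with your own contact details.
            "User-Agent": "lotus-example/0.1 (illustrative sketch)",
            "Accept": "application/sparql-results+json",
        },
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(build_query(5))` needs network access and returns a handful of rows to inspect. For the full dataset, the Zenodo records linked above are probably the more practical route.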
The build currently uses https://nodejs.org/dist/v12.10.0/, for which there are no darwin-arm64 builds on the node website. The first release line that appears to ship them is v16, for example https://nodejs.org/dist/latest-v16.x/node-v16.20.2-darwin-arm64.tar.gz. Can you verify whether this, or a more recent version of node, will work correctly for building the docker image on Apple M architectures?
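One way to check which Node.js releases ship a macOS arm64 tarball is to look at the release index that nodejs.org publishes. The sketch below assumes that https://nodejs.org/dist/index.json is a JSON list of releases, each with a `version` string and a `files` list in which `osx-arm64-tar` marks the darwin-arm64 tarball; the function names are hypothetical.

```python
import json
import urllib.request

# Assumption: release index published by nodejs.org.
INDEX_URL = "https://nodejs.org/dist/index.json"
# Assumption: identifier used in the index for the darwin-arm64 tarball.
ARM64_MAC_TARBALL = "osx-arm64-tar"


def parse_version(tag):
    """Turn a tag such as 'v16.20.2' into a sortable tuple (16, 20, 2)."""
    return tuple(int(part) for part in tag.lstrip("v").split("."))


def earliest_arm64_mac_release(releases):
    """Return the lowest version whose file list includes the
    darwin-arm64 tarball, or None if no release has one."""
    candidates = [
        release["version"]
        for release in releases
        if ARM64_MAC_TARBALL in release.get("files", [])
    ]
    return min(candidates, key=parse_version) if candidates else None


def fetch_releases(url=INDEX_URL):
    """Download and parse the release index (requires network access)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

Running `earliest_arm64_mac_release(fetch_releases())` would then report the oldest release with an arm64 macOS build. Whether the docker image actually builds with that version still has to be tested separately.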