OpenSourceMalaria / OSM_To_Do_List

Action Items in the Open Source Malaria Consortium

Do we need a whip-round? #543

Open cdsouthan opened 7 years ago

cdsouthan commented 7 years ago

@MedChemProf hmmm - maybe we could sell some front-runner compounds to vendors to raise the cash for you? Would be sad if Google stopped indexing your good stuff


MFernflower commented 7 years ago

I do not see why the subscription expiring would mess with Google's indexing?

cdsouthan commented 7 years ago

I confess this was a tongue-in-cheek comment. I believe this ELN will be returned eventually to input-enabled status when the cheque gets into the post. However, indirectly it does supply an argument for free beer OS ELNs but this in turn depends on the balance of features that could make the investment worth it (and the results "open" either way)

drc007 commented 7 years ago

If a subscription lapses they could make a notebook read only.

MFernflower commented 7 years ago

It would be great if we had an archive server where people could archive things like their lab archive - should not need more than 500 GB for everything, including some future-proofing @mattodd @cdsouthan

mcoster commented 7 years ago

I don't have any previous experience with it, but could OSM contributors just use figshare?

MFernflower commented 7 years ago

@mcoster I was thinking of an FTP server hosted in a secure location where everybody in the project can get a private username and password (for adding and removing stuff ofc - everything could be Google-indexed!) - @mattodd How hard would this be to set up?

mcoster commented 7 years ago

I guess it depends on what people want to achieve - archiving suitable items on figshare would, I believe, promote visibility and may lead to wider engagement. An FTP server would provide a higher degree of control and flexibility, I presume.

MFernflower commented 7 years ago

FTP servers are not polished in any way but do indeed provide total control over everything!!!! One can simply export all of one's lab notebook as PDF files and upload them onto the server with minimal stress.

However, this lack of polish does not stop us from getting indexed by Google!

An example of an FTP setup can be found here: ftp://ftp.knoppix.nl/pub/os/Linux/distr/knoppix-dvd/

drc007 commented 7 years ago

I have to say PDFs are the bane of my life: all chemical structural information and spectral information is lost. See the PMR blog https://blogs.ch.cam.ac.uk/pmr/2008/05/03/the-merits-and-demerits-of-pdf/.

Need to capture chemistry in a chemical file format, similarly spectroscopy as JCAMP or similar.

I suspect you are underestimating the volume of data, my work file store contains several terabytes of files. You also need to have backups, preferably in different geographic locations.

Or is this what LabArchives does?

cdsouthan commented 7 years ago

For the free and permanent archiving of basic stuff (e.g. chemical strings, activity data, assay descriptions etc) the low resistance path to Figshare would take some beating (e.g. my stuff https://figshare.com/search?q=southan&quick=1). OK so I know it's Digital Science Big Brother in the background but hey. However, AWAK spectra and synthetic schema are very different kettles of fish. I have little cognisance about such but I have come across https://www.researchgate.net/project/The-Open-Spectral-Database and there are also some spectra captured in both ChemSpider and PubChem

drc007 commented 7 years ago

Is figshare academics only?

cdsouthan commented 7 years ago

AFAIK that's not a restriction - besides, I won't tell anyone @drc007 so long as that crate of wine arrives at least once a year...

mattodd commented 7 years ago

Thanks for raising this @cdsouthan . It's an important consideration. Since OSM can't mandate that people use a certain ELN, we will encounter this issue, e.g. if students use their own solution to contribute. We therefore need to ensure that the ELN content remains in the public domain in a searchable way (text and any appended data). The solution is either to be home-grown, or to use other services. We don't have the resources to commit to a home-grown solution at the moment - we can certainly set up a server in a few minutes, but we don't have the resources to guarantee that the server will be operational in 5 years' time. Instead solutions like Figshare present something useful - a free data deposition system that has a low barrier to entry and which has good guarantees of permanence (CLOCKSS) and support. I would recommend it. I had thought to recommend Dryad, but I now see that they charge.

The alternative is to use University repositories. These have similar guarantees of permanence. Those of us not affiliated directly with universities can I'm sure impress upon Uni colleagues the importance of archiving work. I would be happy to use USyd's repo for any OSM work, for example.

The key is to ensure (as @drc007 says) that the content is complete and searchable. We must do all we can to make sure of permanence. So if a lab notebook is no longer active we as a community need to remind creators of those lab notebooks to archive them (e.g. as xml files) and to upload them to some form of repo. The archived files need to be linked from the relevant OSM project wiki to ensure they are never orphaned.

MFernflower commented 7 years ago

I second Dr. Todd's idea of using the USyd archives to store Chase's lab notebook archives, and those of anyone else who needs archiving for the OSM project

I had never heard the phrase "whip-round" until I saw this post - I had to type it into Merriam-Webster online!

cdsouthan commented 7 years ago

Yet another ELN https://jcheminf.springeropen.com/articles/10.1186/s13321-017-0240-0

mcoster commented 7 years ago

Actually, that looks interesting. Any idea whether it is suitable for open science, i.e. can you set permissions to allow any and all to view ELN entries? I've been watching eLabFTW, which is in very active development, but he hasn't implemented open accessibility of ELN entries yet.

drc007 commented 7 years ago

Claims to be open source, so in theory you could modify it appropriately. Perhaps contact them and ask for a demo and names of customers?

mcoster commented 7 years ago

Sounds good, I'll look into this on Monday. I like eLabFTW because the guy running the show is uber open source (hosts on GitHub, etc), but it is a more general science ELN. He has open access to ELN on his roadmap, but no word when, and I have zero PHP skills to help out!

MFernflower commented 7 years ago

Maybe the eLabFTW guys could make us a custom version of their software pro bono?

mcoster commented 7 years ago

I'll chase it up on the eLabFTW repo. In the meantime, I was considering doing a trial run of eLabFTW for my group, and looking into running some kind of script that could export ELN data to an open access webpage until there is legit open read access.
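A minimal sketch of what such an export script could look like, assuming the ELN entries have already been pulled out as plain dictionaries. The field names (`id`, `title`, `author`, `body`) and the output layout are hypothetical and are not taken from eLabFTW; the point is only that static HTML pages with an index are easy for search engines to crawl.

```python
from html import escape
from pathlib import Path


def render_entry(entry):
    """Render one experiment as a standalone HTML page (hypothetical fields)."""
    title = escape(entry["title"])
    author = escape(entry["author"])
    body = escape(entry["body"])
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        f"<title>{title}</title></head>\n"
        f"<body><h1>{title}</h1><p>Author: {author}</p>"
        f"<pre>{body}</pre></body></html>\n"
    )


def export_entries(entries, out_dir):
    """Write one page per entry plus an index page; return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    links = []
    for entry in entries:
        page = out / f"experiment-{entry['id']}.html"
        page.write_text(render_entry(entry), encoding="utf-8")
        written.append(page)
        links.append(f'<li><a href="{page.name}">{escape(entry["title"])}</a></li>')
    # The index gives crawlers a single entry point to every experiment page.
    index = out / "index.html"
    index.write_text(
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Experiments</title></head>\n'
        "<body><ul>\n" + "\n".join(links) + "\n</ul></body></html>\n",
        encoding="utf-8",
    )
    written.append(index)
    return written
```

Keeping the author on every page preserves provenance, and escaping all text avoids broken markup when an entry contains characters such as `<` or `&`.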

cdsouthan commented 7 years ago

While you're at it @mcoster script up something to format a data sheet ready to go to PubChem BioAssay? (not auto submit but just ready for careful manual QC before it goes off)
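A rough sketch of the "ready for manual QC" idea, assuming the assay results are available as simple records. The column names below are hypothetical and are not PubChem's actual deposition template; the script only builds a CSV sheet and flags rows with missing values so a person can check them before anything is submitted.

```python
import csv
import io

# Hypothetical columns; replace with whatever the real deposition sheet needs.
COLUMNS = ["compound_id", "smiles", "assay", "value", "unit"]


def build_data_sheet(records):
    """Return (csv_text, problems) where problems lists rows needing attention."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS + ["qc_flag"], lineterminator="\n")
    writer.writeheader()
    problems = []
    for row_number, rec in enumerate(records, start=1):
        # A field counts as missing if it is absent, None, or blank.
        missing = [
            c for c in COLUMNS
            if rec.get(c) is None or not str(rec.get(c)).strip()
        ]
        if missing:
            problems.append((row_number, missing))
        row = {c: rec.get(c, "") for c in COLUMNS}
        row["qc_flag"] = "missing: " + ", ".join(missing) if missing else "ok"
        writer.writerow(row)
    return buf.getvalue(), problems
```

Nothing is submitted automatically: the `qc_flag` column and the returned list of problem rows are there to support the careful manual check before deposition.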

NicolasCARPi commented 7 years ago

Hello everyone. I'm the main developer of eLabFTW, the free and open source ELN. If you wish to use eLab without any kind of authentication, I see several options:

Cheers, ~Nico

mcoster commented 7 years ago

Thanks for your input @NicolasCARPi - it sounds like there is some flexibility. I like the sound of the last workaround you suggested - the main aim is to allow anyone who wants to see experimental details to see them, while allowing individual experimenters to have edit access to their own experiments. Most Open Source Malaria contributors use either LabTrove or LabArchives based ELNs and share URLs to their experiments here on GitHub. One of the aims of OSM is for as much as possible of the science to be indexed by web services like Google, so that it is highly findable.

Still keen on proper implementation, if you are able to fit it into your (very active!) development schedule.

Best wishes,

Mark

mattodd commented 7 years ago

Interesting thread. Three quick things.

1) @mcoster is right in that the optimal solution will have experiments/data that are viewable without any kind of login, to maximise indexing by search engines and minimise barriers to participation. Having to sign in to view anything is a red line.

2) Provenance is important, so a shared login where it's not clear who is doing what is also a no-no. A shared login (in order to post) where it is possible to see who has done what would be fine. (I may be misunderstanding you @NicolasCARPi )

3) The Chemotion ELN that was linked to earlier looks nice (I saw it in action at a Beilstein meeting recently) and is, for the moment, supported by the host Uni that runs it. I'd like to flag up again @lpatiny 's C6H6 paper that just came out. For us here in Sydney the issue we'd need (Luc knows!) in order to implement the system more widely is structure-based search on all reactions. I'm fairly sure all chemists would want this, ultimately.

drc007 commented 7 years ago

@NicolasCARPi @mattodd @mcoster I've helped implement ELNs several times and the thing that always comes up is the choice of chemical editor. In my experience most chemists will refuse to use anything other than their preferred editor, so support for multiple editors is critical.

lpatiny commented 7 years ago

We now have a prototype of substructure search for reactions as well for www.c6h6.org on a test server. It will be released this month.

mcoster commented 7 years ago

I should point out that my group's needs and wants are no doubt different from others'. For me, the absolute, paramount, overriding, most important feature is usability. My experience so far is that if it isn't extremely easy to grok, and bug-free, I will struggle to get uptake and compliance, and no ELN use is much worse than use of a sub-optimal ELN. The students I am supervising for OSM projects at the moment are completely new to research. Everything is new to them, so it can be overwhelming. It took constant cajoling to get them to try the LabTrove ELN, and then for some reason we encountered bugs and the possibility of implementing an ELN slipped away for this term...

For our use case, I would be happy to just stick to pasted images for reaction schemes - seamless drag'n'drop from eg. MarvinSketch would be ideal. Most of these students are using chemical drawing programs for the first time, so they don't have a favourite. We don't have a site license for ChemDraw, so that can't be their default.

Likewise, drag'n'drop for other file types would be great - risk assessment (PDF), TLC plate (JPG), etc.

Intuitive table functionality for reaction stoichiometries.

That's about it. I really like the idea of more chemically-aware systems and I think they would be great for future PhD students and postdocs of mine, but the additional learning overhead that comes with these is too high for the project students.

cdsouthan commented 7 years ago

I had to look up "grok" but that's OK....

NicolasCARPi commented 7 years ago

FYI, eLabFTW implements ChemDoodle for chemical drawing, and you can drag and drop files to your experiment so they are uploaded. If your file is a molecule, it'll be displayed correctly. If it's a 3D structure, that works too, thanks to 3Dmol.

I agree with @mcoster, usability is very important. If you need 15 clicks to get something done, then it's not usable.

Now I must say that you folks look very chemistry oriented, and elabftw is not chemistry oriented. It's everything oriented (including chemistry). So some advanced chemistry features might be missing.

Cheers, ~Nico

NicolasCARPi commented 6 years ago

Hello everyone, just to let you know, elabftw's next version (coming soon on your servers) will implement a way to have Anonymous visitors, with read access to all the 'Public' experiments (and database items if you let them).

Cheers, ~Nico

mcoster commented 6 years ago

Thanks Nico. I was excited to see those commits and very keen to try out the 'Public' access to experiments in eLabFTW!