statonlab / hardwoods_site

Hardwoods Genomics bugs, data loading, and general issues
GNU General Public License v3.0

Tripal EUtils issues #488

Closed CaseyRichards92 closed 5 years ago

CaseyRichards92 commented 5 years ago

Link to get to EUtils: https://hardwoods.ag.utk.edu/admin/tripal/loaders/eutils_ncbi_import
Link for documentation for EUtils: https://tripal-eutils.readthedocs.io/en/latest/introduction.html

CaseyRichards92 commented 5 years ago

The importer does not show the analysis record, properties, or additional records fields, so I could not view the metadata and linked records that will be inserted.

image

Also, the publication links to the home page of the HWG site. Is it supposed to link there?

bradfordcondon commented 5 years ago

Clarification: it does show those things if they are present. The NCBI project doesn't include linked records or additional metadata :P. In the case of this project, I think the SRA is the record that would help (we don't currently support SRA): https://www.ncbi.nlm.nih.gov/sra?LinkName=bioproject_sra_all&from_uid=429376

However, you can see that the project itself just doesn't have a lot of info, and doesn't link to the BioSample or the SRA itself:

<DocumentSummary uid="429376">
    <Project>
        <ProjectID>
            <ArchiveID accession="PRJDB4532" archive="DDBJ" id="429376"/>
        </ProjectID>
        <ProjectDescr>
            <Name>Mangifera indica</Name>
            <Title>Genome sequencing of mango (Mangifera indica) cultivar 'Irwin'</Title>
            <Description>This genome was sequenced to search and construct mango genomic DNA markers. Cultivar 'Irwin' is leading cultivar in Japan.</Description>
            <Publication id="10.1007/s11295-017-1192-2">
                <Reference/>
                <DbType>eDOI</DbType>
            </Publication>
            <ProjectReleaseDate>2018-01-10T00:05:47Z</ProjectReleaseDate>
        </ProjectDescr>
        <ProjectType>
            <ProjectTypeSubmission>
                <Target capture="eWhole" material="eGenome" sample_scope="eMonoisolate">
                    <Organism species="29780" taxID="29780">
                        <OrganismName>Mangifera indica</OrganismName>
                        <Supergroup>eEukaryotes</Supergroup>
                    </Organism>
                </Target>
                <Method method_type="eSequencing"/>
                <Objectives>
                    <Data data_type="eRawSequenceReads"/>
                </Objectives>
                <ProjectDataTypeSet>
                    <DataType>Genome sequencing</DataType>
                </ProjectDataTypeSet>
            </ProjectTypeSubmission>
        </ProjectType>
    </Project>
    <Submission last_update="2018-01-10" submitted="2016-01-26">
        <Description>
            <!-- Submitter information has been removed -->
            <Organization role="owner" type="center">
                <Name>Genome Unit, NARO Institute of Fruit Tree Science</Name>
                <!-- Contact information has been removed -->
            </Organization>
            <Access>public</Access>
        </Description>
    </Submission>
</DocumentSummary>

Here's the project: https://www.ncbi.nlm.nih.gov/bioproject/?term=PRJDB4532
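
To make the point about the sparse record concrete, here is a minimal Python sketch that parses a trimmed copy of the DocumentSummary above and reports what it contains. It is not the Tripal EUtils parser: the function name `summarize_project` and the tag-name heuristic for spotting linked records are assumptions made for illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the DocumentSummary shown above (only the fields inspected here).
SUMMARY_XML = """
<DocumentSummary uid="429376">
    <Project>
        <ProjectID>
            <ArchiveID accession="PRJDB4532" archive="DDBJ" id="429376"/>
        </ProjectID>
        <ProjectDescr>
            <Name>Mangifera indica</Name>
            <Title>Genome sequencing of mango (Mangifera indica) cultivar 'Irwin'</Title>
            <Publication id="10.1007/s11295-017-1192-2">
                <Reference/>
                <DbType>eDOI</DbType>
            </Publication>
        </ProjectDescr>
    </Project>
</DocumentSummary>
"""


def summarize_project(xml_text):
    """Hypothetical helper: pull out the fields a loader could use."""
    root = ET.fromstring(xml_text.strip())
    archive = root.find("./Project/ProjectID/ArchiveID")
    descr = root.find("./Project/ProjectDescr")
    return {
        "accession": archive.get("accession"),
        "title": descr.findtext("Title"),
        # Publication identifiers are stored in the "id" attribute.
        "publications": [p.get("id") for p in descr.findall("Publication")],
        # Rough heuristic (an assumption): any element whose tag name
        # mentions BioSample or SRA would count as a linked record.
        "linked_record_tags": [
            el.tag
            for el in root.iter()
            if "biosample" in el.tag.lower() or "sra" in el.tag.lower()
        ],
    }


info = summarize_project(SUMMARY_XML)
print(info)
```

For this record the `linked_record_tags` list comes back empty: the only cross-reference in the summary is the publication DOI.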

So on the web you see the SRA/BioSample; what gives, right? The NCBI site does extra searches on each DB to check whether THOSE records refer to the project, which we don't do and decided not to do... we could revisit, though. In the case of assemblies, that information can be (but is not always) directly linked. I don't know if the same is true for SRA.
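
If this were revisited, one possible way to do the extra lookup is the NCBI E-utilities `elink` endpoint, which asks a target database for records linked to a given BioProject UID. The sketch below only builds the request URLs and does not call the network. The helper name `build_elink_url` is hypothetical, the link name `bioproject_sra_all` is taken from the SRA URL quoted earlier in this thread, and whether `elink` actually returns records for this project has not been verified here.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_elink_url(uid, target_db, linkname=None):
    """Hypothetical helper: build an elink request from BioProject to target_db."""
    params = {"dbfrom": "bioproject", "db": target_db, "id": uid}
    if linkname:
        # Restrict the result to one named link type.
        params["linkname"] = linkname
    return f"{EUTILS_BASE}/elink.fcgi?{urlencode(params)}"


# UID 429376 is the DocumentSummary uid for PRJDB4532 shown above.
sra_url = build_elink_url("429376", "sra", "bioproject_sra_all")
biosample_url = build_elink_url("429376", "biosample")
print(sra_url)
print(biosample_url)
```

Each linked database needs its own request, which is the extra per-DB search cost described above.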

almasaeed2010 commented 5 years ago

I believe we decided not to run eutils and hq on the live site, so closing.