VEuPathDB / EdaNewIssues


Mbio alpha div: Unable to fetch all required data #611

Closed by asizemore 1 year ago

asizemore commented 1 year ago

This works for other studies, and it even works when I change either the data or the method of the computation while keeping the viz params the same. So maybe the computation returns some funky data?

[Screenshot: Screen Shot 2023-03-31 at 10 24 28 AM]

asizemore commented 1 year ago

Also seen on FARMM

[Screenshot: Screen Shot 2023-03-31 at 2 46 16 PM]

asizemore commented 1 year ago

Also Bangladesh Class, Shannon.

Looks like the merging service is upset about a header. The following is from the merging service logs:

Tabular subsetting result of type 'EUPATH_0000813' contained unexpected header.
Expected:entity_16SRRNAV4Assay_stable_id,Sample_stable_id,ParticipantRepeatedMeasure_stable_id,Participant_stable_id,alphaDiversity
Found   : entity_16SRRNAV4Assay_stable_id,Sample_stable_id,Prm_stable_id,Participant_stable_id,alphaDiversity
    at org.veupathdb.service.eda.ms.core.stream.EntityStream.beginValidatedInput(EntityStream.java:77) ~[service.jar:3.0.0]
    at org.veupathdb.service.eda.ms.core.stream.EntityStream.<init>(EntityStream.java:57) ~[service.jar:3.0.0]
    at org.veupathdb.service.eda.ms.core.stream.RootEntityStream.lambda$new$5(RootEntityStream.java:46) ~[service.jar:3.0.0]
    at java.util.Optional.map(Optional.java:260) ~[?:?]
...
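
To make the mismatch in the log easier to spot, here is a minimal Python sketch of a positional header comparison. It is not the merging service's actual `EntityStream` code (that is Java and not shown in this thread); the function name `diff_headers` is hypothetical, and the comma delimiter is assumed from how the log prints the header.

```python
def diff_headers(expected, found):
    """Return (position, expected_name, found_name) for each mismatched column."""
    mismatches = []
    for i in range(max(len(expected), len(found))):
        exp = expected[i] if i < len(expected) else None
        got = found[i] if i < len(found) else None
        if exp != got:
            mismatches.append((i, exp, got))
    return mismatches


# Header strings copied from the merging service log above.
expected = (
    "entity_16SRRNAV4Assay_stable_id,Sample_stable_id,"
    "ParticipantRepeatedMeasure_stable_id,Participant_stable_id,alphaDiversity"
).split(",")
found = (
    "entity_16SRRNAV4Assay_stable_id,Sample_stable_id,"
    "Prm_stable_id,Participant_stable_id,alphaDiversity"
).split(",")

print(diff_headers(expected, found))
# → [(2, 'ParticipantRepeatedMeasure_stable_id', 'Prm_stable_id')]
```

Only the third column differs: the found header abbreviates the participant repeated measure entity as `Prm`.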

@d-callan or @ryanrdoherty does this error look familiar?

d-callan commented 1 year ago

Was this data reloaded recently or something? Why does the participant repeated measures node have an ID that spells it out in one case, while in the other case it says 'Prm'?

asizemore commented 1 year ago

@jaycolin do you know if these studies were reloaded recently?

jaycolin commented 1 year ago

Not recently, no. I'm puzzled by the error above. EUPATH_0000813 is the 16S assay entity, and the only table where I would expect to find "ParticipantRepeatedMeasure_stable_id" is EDA.ANCESTORS_BANGLADESH_HEALTHY_5YR_1_ENTITY_16SRRNAV4ASSAY. That table does have the correct column names. The precursor tables EDA.EntityType and EDA.EntityTypeGraph also show "ParticipantRepeatedMeasure". I can't find anywhere that "Prm" was used (and that would be a loading issue).

aurreco-uga commented 1 year ago

The bad header was in MinIO, in the output_data file for the compute job ID. The valid header is in the database (eda.ENTITYTYPE). We need a better way to clean up invalid compute data in MinIO/Postgres, but for now we can fix these manually, one by one (modify and upload).
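
The "modify" half of that manual fix could look roughly like the Python sketch below, which rewrites one column name in the header line of a downloaded copy of the file. The helper name `fix_header`, the comma delimiter, and the sample data row are all assumptions for illustration; the real output_data layout and the download/upload steps are not shown in this thread.

```python
def fix_header(text, old_name, new_name, delimiter=","):
    """Rename one column in the first line of a tabular file, leaving all rows untouched."""
    header, newline, rest = text.partition("\n")
    columns = header.split(delimiter)
    fixed = [new_name if col == old_name else col for col in columns]
    return delimiter.join(fixed) + newline + rest


# Hypothetical file content: the header is from the log, the data row is made up.
stale = (
    "entity_16SRRNAV4Assay_stable_id,Sample_stable_id,"
    "Prm_stable_id,Participant_stable_id,alphaDiversity\n"
    "assay1,sample1,prm1,participant1,1.0\n"
)

repaired = fix_header(stale, "Prm_stable_id", "ParticipantRepeatedMeasure_stable_id")
print(repaired.splitlines()[0])
```

Matching whole column names, rather than doing a plain string replace, avoids touching data rows that happen to contain the same text.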

asizemore commented 1 year ago

Thanks all! I appreciate it!